(54) Title: NOVEL NUCLEIC ACIDS AND POLYPEPTIDES



(57) Abstract: The present invention provides novel nucleic acids, novel polypeptide sequences encoded by these nucleic acids and

NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by
such polynucleotides, along with uses for these polynucleotides and proteins, for example
in therapeutic, diagnostic and research methods.

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines,
such as lymphokines, interferons, CSFs, chemokines, and interleukins) has matured
rapidly over the past decade. The now routine hybridization cloning and expression
cloning techniques clone novel polynucleotides "directly" in the sense that they rely on
information directly related to the discovered protein (i.e., partial DNA/amino acid
sequence of the protein in the case of hybridization cloning; activity of the protein in the
case of expression cloning). More recent "indirect" cloning techniques such as signal
sequence cloning, which isolates DNA sequences based on the presence of a now
well-recognized secretory leader sequence motif, as well as various PCR-based or low
stringency hybridization-based cloning techniques, have advanced the state of the art by
making available large numbers of DNA/amino acid sequences for proteins that are
known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of
PCR-based techniques, or by virtue of structural similarity to other genes of known
biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications
in, for example, diagnostics, forensics, gene mapping, identification of mutations
responsible for genetic disorders or other traits, assessment of biodiversity, and production
of many other types of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA
molecules, cloned genes or degenerate variants thereof, especially naturally occurring
variants such as allelic variants, antisense polynucleotide molecules, and antibodies that
specifically recognize one or more epitopes present on such polypeptides, as well as
hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including
expression vectors, containing the polynucleotides of the invention, cells genetically
engineered to contain such polynucleotides and cells genetically engineered to express such
polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic
acid sequence assembled from expressed sequence tags (ESTs) isolated mainly by
sequencing by hybridization (SBH), and in some cases, sequences obtained from one or
more public databases. The invention relates also to the proteins encoded by such
polynucleotides, along with therapeutic, diagnostic and research utilities for these
polynucleotides and proteins. These nucleic acid sequences are designated as SEQ ID NO:
1-438 and are provided in the Sequence Listing. In the nucleic acids provided in the
Sequence Listing, A is adenine; C is cytosine; G is guanine; T is thymine; and N is any of
the four bases. In the amino acids provided in the Sequence Listing, * corresponds to the
stop codon.

The nucleic acid sequences of the present invention also include nucleic acid
sequences that hybridize to the complement of SEQ ID NO: 1-438 under stringent
hybridization conditions; nucleic acid sequences which are allelic variants or species
homologues of any of the nucleic acid sequences recited above; or nucleic acid sequences
that encode a peptide comprising a specific domain or truncation of the peptides encoded by
SEQ ID NO: 1-438. Also included is a polynucleotide comprising a nucleotide sequence having
at least 90% identity to an identifying sequence of SEQ ID NO: 1-438 or a degenerate variant or
fragment thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-438. The sequence
information can be a segment of any one of SEQ ID NO: 1-438 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1-438.

A collection as used in this application can be a collection of only one
polynucleotide. The collection of sequence information or identifying information of each
sequence can be provided on a nucleic acid array. In one embodiment, segments of
sequence information are provided on a nucleic acid array to detect the polynucleotide that
contains the segment. The array can be designed to detect full-match or mismatch to the
polynucleotide that contains the segment. The collection can also be provided in a
computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic
acid sequences recited above; cloning or expression vectors containing the nucleic acid
sequences; and host cells or organisms transformed with these expression vectors. Nucleic
acid sequences (or their reverse or direct complements) according to the invention have
numerous applications in a variety of techniques known to those skilled in the art of
molecular biology, such as use as hybridization probes, use as primers for PCR, use in an
array, use in computer-readable media, use in sequencing full-length genes, use for
chromosome and gene mapping, use in the recombinant production of protein, and use in
the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-438 or
novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the
nucleic acid sequences of SEQ ID NO: 1-438 or novel segments or parts of the nucleic acids
provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence
tags for physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO:
1-438; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID
NO: 1-438; and a polynucleotide comprising any of the nucleotide sequences of the mature
protein coding sequences of SEQ ID NO: 1-438. The polynucleotides of the present
invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent hybridization conditions to (a) the complement of any one of the nucleotide
sequences set forth in SEQ ID NO: 1-438; (b) a nucleotide sequence encoding any one of






WO 02/081731 



PCT/US02/01222 



the amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an
allelic variant of any polynucleotides recited above; (d) a polynucleotide which encodes a
species homolog (e.g., orthologs) of any of the proteins recited above; or (e) a
polynucleotide that encodes a polypeptide comprising a specific domain or truncation of any
of the polypeptides comprising an amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a
polypeptide comprising any of the amino acid sequences set forth in the Sequence Listing;
or the corresponding full length or mature protein. Polypeptides of the invention also include
polypeptides with biological activity that are encoded by (a) any of the polynucleotides
having a nucleotide sequence set forth in SEQ ID NO: 1-438; or (b) polynucleotides that
hybridize to the complement of the polynucleotides of (a) under stringent hybridization
conditions. Biologically or immunologically active variants of any of the polypeptide
sequences in the Sequence Listing, and "substantial equivalents" thereof (e.g., with at least
about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% amino acid sequence identity)
that preferably retain biological activity are also contemplated. The polypeptides of the
invention may be wholly or partially chemically synthesized but are preferably produced by
recombinant means using the genetically engineered cells (e.g., host cells) of the invention.

The invention also provides compositions comprising a polypeptide of the
invention. Polypeptide compositions of the invention may further comprise an acceptable
carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a 
polynucleotide of the invention. 

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture
medium under conditions permitting expression of the desired polypeptide, and purifying
the polypeptide from the culture or from the host cells. Preferred embodiments include
those in which the protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a
variety of techniques known to those skilled in the art of molecular biology. These
techniques include use as hybridization probes, use as oligomers, or primers, for PCR,
use for chromosome and gene mapping, use in the recombinant production of protein,
and use in generation of anti-sense DNA or RNA, their chemical analogs and the like.
For example, when the expression of an mRNA is largely restricted to a particular cell or
tissue type, polynucleotides of the invention can be used as hybridization probes to detect
the presence of the particular cell or tissue mRNA in a sample using, e.g., in situ
hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as 
expressed sequence tags for identifying expressed genes or, as well known in the art and 
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for 
physical mapping of the human genome. 

The polypeptides according to the invention can be used in a variety of
conventional procedures and methods that are currently applied to other proteins. For
example, a polypeptide of the invention can be used to generate an antibody that
specifically binds the polypeptide. Such antibodies, particularly monoclonal antibodies,
are useful for detecting or quantitating the polypeptide in tissue. The polypeptides of the
invention can also be used as molecular weight markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical
condition which comprises the step of administering to a mammalian subject a
therapeutically effective amount of a composition comprising a polypeptide of the
present invention and a pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be
utilized, for example, in methods for the prevention and/or treatment of disorders
involving aberrant protein expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for
example, be utilized as part of prognostic and diagnostic evaluation of disorders as
recited herein and for the identification of subjects exhibiting a predisposition to such
conditions. The invention provides a method for detecting the polynucleotides of the
invention in a sample, comprising contacting the sample with a compound that binds to
and forms a complex with the polynucleotide of interest for a period sufficient to form
the complex and under conditions sufficient to form a complex and detecting the complex
such that if a complex is detected, the polynucleotide of interest is detected. The
invention also provides a method for detecting the polypeptides of the invention in a
sample comprising contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to form the
complex and detecting the formation of the complex such that if a complex is formed, the
polypeptide is detected.

The invention also provides kits comprising polynucleotide probes and/or
monoclonal antibodies, and optionally quantitative standards, for carrying out methods of
the invention. Furthermore, the invention provides methods for evaluating the efficacy of
drugs, and monitoring the progress of patients, involved in clinical trials for the treatment
of disorders as recited above.

The invention also provides methods for the identification of compounds that
modulate (i.e., increase or decrease) the expression or activity of the polynucleotides
and/or polypeptides of the invention. Such methods can be utilized, for example, for the
identification of compounds that can ameliorate symptoms of disorders as recited herein.
Such methods can include, but are not limited to, assays for identifying compounds and
other substances that interact with (e.g., bind to) the polypeptides of the invention. The
invention provides a method for identifying a compound that binds to the polypeptides of
the invention comprising contacting the compound with a polypeptide of the invention in
a cell for a time sufficient to form a polypeptide/compound complex, wherein the
complex drives expression of a reporter gene sequence in the cell; and detecting the
complex by detecting the reporter gene sequence expression such that if expression of the
reporter gene is detected the compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve
the administration of the polynucleotides or polypeptides of the invention to individuals
exhibiting symptoms or tendencies. In addition, the invention encompasses methods for
treating diseases or disorders as recited herein comprising administering compounds and
other substances that modulate the overall activity of the target gene products.
Compounds and other substances can effect such modulation either on the level of target
gene/protein expression or target protein activity.

The polypeptides of the present invention and the polynucleotides encoding them
are also useful for the same functions known to one of skill in the art as the polypeptides
and polynucleotides to which they have homology (set forth in Table 2); for which they
have a signature region (as set forth in Table 3); or for which they have homology to a
gene family (as set forth in Table 4). If no homology is set forth for a sequence, then the
polypeptides and polynucleotides of the present invention are useful for a variety of
applications, as described herein, including use in arrays for detection.

4. DETAILED DESCRIPTION OF THE INVENTION 


4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular 
forms "a", "an" and "the" include plural references unless the context clearly dictates 
otherwise. 

The term "active" refers to those forms of the polypeptide which retain the
biologic and/or immunologic activities of any naturally occurring polypeptide. According
to the invention, the terms "biologically active" or "biological activity" refer to a protein
or peptide having structural, regulatory or biochemical functions of a naturally occurring
molecule. Likewise "immunologically active" or "immunological activity" refers to the
capability of the natural, recombinant or synthetic polypeptide to induce a specific
immune response in appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are
engaged in extracellular or intracellular membrane trafficking, including the export of
secretory or enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded
molecules may be "partial" such that only some of the nucleic acids bind or it may be
"complete" such that total complementarity exists between the single stranded molecules.
The degree of complementarity between the nucleic acid strands has significant effects on
the efficiency and strength of the hybridization between the nucleic acid strands.
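
By way of illustration only, the notions of complete and partial complementarity defined above can be sketched in a few lines of Python. The function names are hypothetical and are not part of the specification, and the sketch assumes unambiguous DNA bases (A, C, G, T) with both strands written 5' to 3'.

```python
# Illustrative sketch only; names are hypothetical, not from the specification.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the strand that base-pairs with seq, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def complementarity(strand_a: str, strand_b: str) -> float:
    """Fraction of positions that base-pair when two equal-length strands
    (both written 5'->3') are aligned antiparallel.

    1.0 corresponds to "complete" complementarity; any value strictly
    between 0 and 1 corresponds to "partial" complementarity.
    """
    if not strand_a or len(strand_a) != len(strand_b):
        raise ValueError("strands must be non-empty and of equal length")
    partner = reverse_complement(strand_b)
    matches = sum(a == b for a, b in zip(strand_a.upper(), partner))
    return matches / len(strand_a)
```

With this convention, 5'-AGT-3' pairs completely with 5'-ACT-3', which is the same strand as 3'-TCA-5' read in the opposite direction.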
The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term
"germ line stem cells (GSCs)" refers to stem cells derived from primordial stem cells that
provide a steady and continuous source of germ cells for the production of gametes. The
term "primordial germ cells (PGCs)" refers to a small population of cells set aside from
other cell lineages particularly from the yolk sac, mesenteries, or gonadal ridges during
embryogenesis that have the potential to differentiate into germ cells and other cells.
PGCs are the source from which GSCs and ES cells are derived. The PGCs, the GSCs
and the ES cells are capable of self-renewal. Thus these cells not only populate the germ
line and give rise to a plurality of terminally differentiated cells that comprise the adult
specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides
which modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably
linked sequence" when the expression of the sequence is altered by the presence of the
EMF. EMFs include, but are not limited to, promoters, and promoter modulating
sequences (inducible elements). One class of EMFs are nucleic acid fragments which
induce the expression of an operably linked ORF in response to a specific regulatory
factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or
the sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic
or synthetic origin which may be single-stranded or double-stranded and may represent
the sense or the antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or
RNA-like material. In the sequences herein A is adenine, C is cytosine, T is thymine, G
is guanine and N is A, C, G or T (U). It is contemplated that where the polynucleotide is
RNA, the T (thymine) in the sequences provided herein is substituted with U (uracil).
Generally, nucleic acid segments provided by this invention may be assembled from
fragments of the genome and short oligonucleotide linkers, or from a series of
oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic acid
which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.
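
As a minimal sketch of the symbol convention stated above (A, C, G, T and N, with T replaced by U where the polynucleotide is RNA), the substitution can be written as follows. The function name and the validation step are illustrative assumptions, not part of the specification.

```python
# Illustrative sketch only; to_rna is a hypothetical helper name.
DNA_ALPHABET = set("ACGTN")  # symbols used in the sequences herein, per the text


def to_rna(dna: str) -> str:
    """Return the RNA form of a DNA sequence by substituting U for T.

    N (any of the four bases) is passed through unchanged.
    """
    dna = dna.upper()
    unknown = set(dna) - DNA_ALPHABET
    if unknown:
        raise ValueError(f"unexpected symbols: {sorted(unknown)}")
    return dna.replace("T", "U")
```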

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion,"
or "segment" or "probe" or "primer" are used interchangeably and refer to a sequence of
nucleotide residues which are at least about 5 nucleotides, more preferably at least about
7 nucleotides, more preferably at least about 9 nucleotides, more preferably at least about
11 nucleotides and most preferably at least about 17 nucleotides. The fragment is
preferably less than about 500 nucleotides, preferably less than about 200 nucleotides,
more preferably less than about 100 nucleotides, more preferably less than about 50
nucleotides and most preferably less than 30 nucleotides. Preferably the probe is from
about 6 nucleotides to about 200 nucleotides, preferably from about 15 to about 50
nucleotides, more preferably from about 17 to 30 nucleotides and most preferably from
about 20 to 25 nucleotides. Preferably the fragments can be used in polymerase chain
reaction (PCR), various hybridization procedures or microarray procedures to identify or
amplify identical or related parts of mRNA or DNA molecules. A fragment or segment
may uniquely identify each polynucleotide sequence of the present invention. Preferably
the fragment comprises a sequence substantially similar to any one of SEQ ID NOs: 1-438.

Probes may, for example, be used to determine whether specific mRNA
molecules are present in a cell or tissue or to isolate similar nucleic acid sequences from
chromosomal DNA as described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods
Appl. 1:241-250). They may be labeled by nick translation, Klenow fill-in reaction, PCR,
or other methods well known in the art. Probes of the present invention, their preparation
and/or labeling are elaborated in Sambrook, J. et al., 1989, Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory, NY; or Ausubel, F.M. et al., 1989,
Current Protocols in Molecular Biology, John Wiley & Sons, New York NY, both of
which are incorporated herein by reference in their entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NOs: 1-438. The sequence
information can be a segment of any one of SEQ ID NOs: 1-438 that uniquely identifies
or represents the sequence information of that sequence of SEQ ID NOs: 1-438. One such
segment can be a twenty-mer nucleic acid sequence because the probability that a
twenty-mer is fully matched in the human genome is 1 in 300. In the human genome, there are
three billion base pairs in one set of chromosomes. Because 4^20 possible twenty-mers
exist, there are 300 times more twenty-mers than there are base pairs in a set of human
chromosomes. Using the same analysis, the probability for a seventeen-mer to be fully
matched in the human genome is approximately 1 in 5. When these segments are used in
arrays for expression studies, fifteen-mer segments can be used. The probability that the
fifteen-mer is fully matched in the expressed sequences is also approximately one in five
because expressed sequences comprise less than approximately 5% of the entire genome
sequence.
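
The arithmetic behind these estimates can be checked with a short Python sketch. It assumes, as the text does, three billion base pairs per chromosome set and an expressed fraction of 5%; the function and constant names are hypothetical. The ratio 4^k divided by the number of base pairs is the "one in N" figure, and the exact values come out somewhat larger than the rounded figures quoted above.

```python
# Illustrative sketch only; names are hypothetical, constants are taken from the text.
GENOME_BP = 3_000_000_000      # base pairs in one set of human chromosomes
EXPRESSED_FRACTION = 0.05      # expressed sequences as a share of the genome (upper bound)


def kmers_per_site(k: int, target_bp: float = GENOME_BP) -> float:
    """Number of possible k-mers (4**k) per base pair of target sequence.

    A value of N means a given k-mer is expected to be fully matched
    by chance roughly 1 time in N.
    """
    return 4 ** k / target_bp


if __name__ == "__main__":
    expressed_bp = GENOME_BP * EXPRESSED_FRACTION
    print(f"20-mer, genome:    1 in {kmers_per_site(20):.0f}")
    print(f"17-mer, genome:    1 in {kmers_per_site(17):.1f}")
    print(f"15-mer, expressed: 1 in {kmers_per_site(15, expressed_bp):.1f}")
```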

Similarly, when using sequence information for detecting a single mismatch, a
segment can be a twenty-five mer. The probability that the twenty-five mer would appear in
a human genome with a single mismatch is calculated by multiplying the probability for a
full match (1/4^25) by the increased probability for mismatch at each nucleotide position
(3 x 25). The probability that an eighteen mer with a single mismatch can be detected in an
array for expression studies is approximately one in five. The probability that a twenty-mer
with a single mismatch can be detected in a human genome is approximately one in five.
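
The single-mismatch estimate can be sketched in the same way, using the approximation given in the text: the full-match probability 1/4^k multiplied by 3 x k. The names below are hypothetical, and the genome size and 5% expressed fraction are the figures stated earlier in this section. The exact results are of the same order as the rounded "one in five" figures.

```python
# Illustrative sketch only; names are hypothetical, constants are taken from the text.
GENOME_BP = 3_000_000_000
EXPRESSED_BP = GENOME_BP * 0.05


def single_mismatch_probability(k: int) -> float:
    """Per-site probability of a k-mer match with one mismatch,
    using the text's approximation (1 / 4**k) * (3 * k)."""
    return (3 * k) / 4 ** k


def one_in(k: int, target_bp: float) -> float:
    """Express the chance of a single-mismatch hit in target_bp sites as '1 in N'."""
    return 1 / (single_mismatch_probability(k) * target_bp)


if __name__ == "__main__":
    print(f"20-mer, genome:    1 in {one_in(20, GENOME_BP):.1f}")
    print(f"18-mer, expressed: 1 in {one_in(18, EXPRESSED_BP):.1f}")
```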

The term "open reading frame," ORF, means a series of nucleotide triplets coding
for amino acids without any termination codons and is a sequence translatable into
protein.
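
The definition above can be expressed as a simple check. This is a sketch only: it assumes the standard genetic code (termination codons TAA, TAG and TGA), a DNA sequence read from its first base, and a hypothetical function name.

```python
# Illustrative sketch only; assumes the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_open_reading_frame(seq: str) -> bool:
    """True if seq is a whole number of triplets with no in-frame termination codon."""
    seq = seq.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return all(codon not in STOP_CODONS for codon in codons)
```

Only in-frame triplets are examined, so a termination codon that appears out of frame (for example the TGA inside ATTGAC) does not interrupt the reading frame.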

The terms "operably linked" or "operably associated" refer to functionally related
nucleic acid sequences. For example, a promoter is operably associated or operably
linked with a coding sequence if the promoter controls the transcription of the coding
sequence. While operably linked nucleic acid sequences can be contiguous and in the
same reading frame, certain genetic elements, e.g., repressor genes, are not contiguously
linked to the coding sequence but still control transcription/translation of the coding
sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a
number of differentiated cell types that are present in an adult organism. A pluripotent
cell is restricted in its differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an
oligopeptide, peptide, polypeptide or protein sequence or fragment thereof and to
naturally occurring or synthetic molecules. A polypeptide "fragment," "portion," or
"segment" is a stretch of amino acid residues of at least about 5 amino acids, preferably at
least about 7 amino acids, more preferably at least about 9 amino acids and most
preferably at least about 17 or more amino acids. The peptide preferably is not greater
than about 200 amino acids, more preferably less than 150 amino acids and most
preferably less than 100 amino acids. Preferably the peptide is from about 5 to about 200
amino acids. To be active, any polypeptide must have sufficient length to display
biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by
cells that have not been genetically engineered and specifically contemplates various
polypeptides arising from post-translational modifications of the polypeptide including,
but not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation
and acylation.

The term "translated protein coding portion" means a sequence which encodes
the full length protein which may include any leader sequence or any processing
sequence.

The term "mature protein coding sequence" means a sequence which encodes a
peptide or protein without a signal or leader sequence. The "mature protein portion"
means that portion of the protein which does not include a signal or leader sequence. The
peptide may have been produced by processing in the cell which removes any
leader/signal sequence. The mature protein portion may or may not include the initial
methionine residue. The methionine residue may be removed from the protein during
processing in the cell. The peptide may be produced synthetically or the protein may
have been produced using a polynucleotide only encoding for the mature protein coding
sequence.

The term "derivative" refers to polypeptides chemically modified by such
techniques as ubiquitination, labeling (e.g., with radionuclides or various enzymes),
covalent polymer attachment such as pegylation (derivatization with polyethylene glycol)
and insertion or substitution by chemical synthesis of amino acids such as ornithine,
which do not normally occur in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created
using, e.g., recombinant DNA techniques. Guidance in determining which amino acid
residues may be replaced, added or deleted without abolishing activities of interest, may
be found by comparing the sequence of the particular polypeptide with that of
homologous peptides and minimizing the number of amino acid sequence changes made
in regions of high homology (conserved regions) or by replacing amino acids with
consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides
may be synthesized or selected by making use of the "redundancy" in the genetic code.
Various codon substitutions, such as the silent changes which produce various restriction
sites, may be introduced to optimize cloning into a plasmid or viral vector or expression
in a particular prokaryotic or eukaryotic system. Mutations in the polynucleotide
sequence may be reflected in the polypeptide or domains of other peptides added to the
polypeptide to modify the properties of any part of the polypeptide, to change
characteristics such as ligand-binding affinities, interchain affinities, or
degradation/turnover rate.

Preferably, amino acid "substitutions" are the result of replacing one amino acid
with another amino acid having similar structural and/or chemical properties,
i.e., conservative amino acid replacements. "Conservative" amino acid substitutions may be
made on the basis of similarity in polarity, charge, solubility, hydrophobicity,
hydrophilicity, and/or the amphipathic nature of the residues involved. For example,
nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline,
phenylalanine, tryptophan, and methionine; polar neutral amino acids include glycine,
serine, threonine, cysteine, tyrosine, asparagine, and glutamine; positively charged (basic)
amino acids include arginine, lysine, and histidine; and negatively charged (acidic) amino
acids include aspartic acid and glutamic acid. "Insertions" or "deletions" are preferably
in the range of about 1 to 20 amino acids, more preferably 1 to 10 amino acids. The
variation allowed may be experimentally determined by systematically making insertions,
deletions, or substitutions of amino acids in a polypeptide molecule using recombinant
DNA techniques and assaying the resulting recombinant variants for activity.
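
The four classes of amino acids listed above lend themselves to a simple lookup. The following Python sketch, offered only as an illustration, treats a substitution as "conservative" when both residues fall in the same listed class; the one-letter codes, the dictionary layout and the function name are assumptions of this sketch, and real similarity judgments may weigh further properties.

```python
# Illustrative sketch only; classes follow the groupings listed in the text.
AMINO_ACID_CLASSES = {
    "nonpolar": "ALIVPFWM",       # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": "GSTCYNQ",   # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": "RKH",               # Arg, Lys, His
    "acidic": "DE",               # Asp, Glu
}

# Invert the table: one-letter code -> class name.
CLASS_OF = {
    residue: name
    for name, members in AMINO_ACID_CLASSES.items()
    for residue in members
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues belong to the same class listed above."""
    return CLASS_OF[original.upper()] == CLASS_OF[replacement.upper()]
```

Under this scheme a leucine-to-isoleucine change is conservative, while a lysine-to-aspartic acid change is not.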

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such
alterations can, for example, alter one or more of the biological functions or biochemical
characteristics of the polypeptides of the invention. For example, such alterations may
change polypeptide characteristics such as ligand-binding affinities, interchain affinities,
or degradation/turnover rate. Further, such alterations can be selected so as to generate
polypeptides that are better suited for expression, scale up and the like in the host cells
chosen for expression. For example, cysteine residues can be deleted or substituted with
another amino acid residue in order to eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the
indicated nucleic acid or polypeptide is present in the substantial absence of other
biological macromolecules, e.g., polynucleotides, proteins, and the like. In one
embodiment, the polynucleotide or polypeptide is purified such that it constitutes at least
95% by weight, more preferably at least 99% by weight, of the indicated biological
macromolecules present (but water, buffers, and other small molecules, especially
molecules having a molecular weight of less than 1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide
separated from at least one other component (e.g., nucleic acid or polypeptide) present
with the nucleic acid or polypeptide in its natural source. In one embodiment, the nucleic
acid or polypeptide is found in the presence of (if anything) only a solvent, buffer, ion, or
other component normally present in a solution of the same. The terms "isolated" and
"purified" do not encompass nucleic acids or polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein,
means that a polypeptide or protein is derived from recombinant (e.g., microbial, insect,
or mammalian) expression systems. "Microbial" refers to recombinant polypeptides or
proteins made in bacterial or fungal (e.g., yeast) expression systems. As a product,
"recombinant microbial" defines a polypeptide or protein essentially free of native
endogenous substances and unaccompanied by associated native glycosylation.
Polypeptides or proteins expressed in most bacterial cultures, e.g., E. coli, will be free of
glycosylation modifications; polypeptides or proteins expressed in yeast will have a
glycosylation pattern in general different from those expressed in mammalian cells.

The term "recombinant expression vehicle or vector" refers to a plasmid or phage
or virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. An
expression vehicle can comprise a transcriptional unit comprising an assembly of (1) a
genetic element or elements having a regulatory role in gene expression, for example,
promoters or enhancers, (2) a structural or coding sequence which is transcribed into
mRNA and translated into protein, and (3) appropriate transcription initiation and
termination sequences. Structural units intended for use in yeast or eukaryotic expression
systems preferably include a leader sequence enabling extracellular secretion of
translated protein by a host cell. Alternatively, where recombinant protein is expressed
without a leader or transport sequence, it may include an amino terminal methionine
residue. This residue may or may not be subsequently cleaved from the expressed
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably
integrated a recombinant transcriptional unit into chromosomal DNA or carry the
recombinant transcriptional unit extrachromosomally. Recombinant expression systems
as defined herein will express heterologous polypeptides or proteins upon induction of
the regulatory elements linked to the DNA segment or synthetic gene to be expressed.
This term also means host cells which have stably integrated a recombinant genetic
element or elements having a regulatory role in gene expression, for example, promoters
or enhancers. Recombinant expression systems as defined herein will express
polypeptides or proteins endogenous to the cell upon induction of the regulatory elements
linked to the endogenous DNA segment or gene to be expressed. The cells can be
prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a
membrane, including transport as a result of signal sequences in its amino acid sequence
when it is expressed in a suitable host cell. "Secreted" proteins include without limitation
proteins secreted wholly (e.g., soluble proteins) or partially (e.g., receptors) from the cell
in which they are expressed. "Secreted" proteins also include without limitation proteins
that are transported across the membrane of the endoplasmic reticulum. "Secreted"
proteins are also intended to include proteins containing non-typical signal sequences
(e.g., Interleukin-1 Beta, see Krasney, P.A. and Young, P.R. (1992) Cytokine 4(2):134-143)
and factors released from damaged cells (e.g., Interleukin-1 Receptor Antagonist,
see Arend, W.P. et al. (1998) Annu. Rev. Immunol. 16:27-55).

Where desired, an expression vector may be designed to contain a "signal or
leader sequence" which will direct the polypeptide through the membrane of a cell. Such
a sequence may be naturally present on the polypeptides of the present invention or
provided from heterologous protein sources by recombinant DNA techniques.

The term "stringent" is used to refer to conditions that are commonly understood
in the art as stringent. Stringent conditions can include highly stringent conditions (i.e.,
hybridization to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS),
1 mM EDTA at 65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately
stringent conditions (i.e., washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary
hybridization conditions are described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary
stringent hybridization conditions include washing in 6X SSC/0.05% sodium
pyrophosphate at 37°C (for 14-base oligonucleotides), 48°C (for 17-base oligonucleotides),
55°C (for 20-base oligonucleotides), and 60°C (for 23-base oligonucleotides).
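The oligonucleotide wash temperatures listed above can be captured, purely as an illustrative sketch, in a small lookup table. The names `OLIGO_WASH_TEMP_C` and `wash_temperature_c` are hypothetical, and no interpolation is attempted for lengths the text does not list.

```python
# Wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate, keyed by
# oligonucleotide length in bases. The four entries are those given in the text.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo_length):
    """Return the listed wash temperature, or None if the length is not listed."""
    return OLIGO_WASH_TEMP_C.get(oligo_length)


if __name__ == "__main__":
    for length in (14, 17, 18, 23):
        print(length, wash_temperature_c(length))
```

Returning `None` for unlisted lengths is a deliberate choice in this sketch: the text gives discrete exemplary values and does not state a rule for intermediate lengths.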

As used herein, "substantially equivalent" or "substantially similar" can refer both
to nucleotide and amino acid sequences, for example a mutant sequence, that varies from
a reference sequence by one or more substitutions, deletions, or additions, the net effect
of which does not result in an adverse functional dissimilarity between the reference and
subject sequences. Typically, such a substantially equivalent sequence varies from one of
those listed herein by no more than about 35% (i.e., the number of individual residue
substitutions, additions, and/or deletions in a substantially equivalent sequence, as
compared to the corresponding reference sequence, divided by the total number of
residues in the substantially equivalent sequence is about 0.35 or less). Such a sequence
is said to have 65% sequence identity to the listed sequence. In one embodiment, a
substantially equivalent, e.g., mutant, sequence of the invention varies from a listed
sequence by no more than 30% (70% sequence identity); in a variation of this
embodiment, by no more than 25% (75% sequence identity); and in a further variation of
this embodiment, by no more than 20% (80% sequence identity) and in a further variation
of this embodiment, by no more than 10% (90% sequence identity) and in a further
variation of this embodiment, by no more than 5% (95% sequence identity). Substantially
equivalent, e.g., mutant, amino acid sequences according to the invention preferably have
at least 80% sequence identity with a listed amino acid sequence, more preferably at least
85% sequence identity, more preferably at least 90% sequence identity, more preferably
at least 95% sequence identity, more preferably at least 98% sequence identity, and most
preferably at least 99% sequence identity. Substantially equivalent nucleotide sequences
of the invention can have lower percent sequence identities, taking into account, for
example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity,
more preferably at least about 80% sequence identity, more preferably at least 85%
sequence identity, more preferably at least 90% sequence identity, more preferably at
least about 95% sequence identity, more preferably at least 98% sequence identity, and
most preferably at least 99% sequence identity. For the purposes of the present
invention, sequences having substantially equivalent biological activity and substantially
equivalent expression characteristics are considered substantially equivalent. For the
purposes of determining equivalence, truncation of the mature sequence (e.g., via a
mutation which creates a spurious stop codon) should be disregarded. Sequence identity
may be determined, e.g., using the Jotun Hein method (Hein, J. (1990) Methods
Enzymol. 183:626-645). Identity between sequences can also be determined by other
methods known in the art, e.g., by varying hybridization conditions.
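The arithmetic behind the "no more than about 35%" criterion above can be illustrated with a short sketch. It only reproduces the stated ratio (substitutions plus additions plus deletions, divided by the number of residues in the variant sequence). It does not perform an alignment, does not implement the Jotun Hein method, and says nothing about functional equivalence. The function names and the default threshold argument are assumptions made for the example.

```python
def fraction_varied(substitutions, additions, deletions, variant_length):
    """Residue changes divided by the total residues in the variant sequence."""
    if variant_length <= 0:
        raise ValueError("variant_length must be positive")
    return (substitutions + additions + deletions) / variant_length


def within_variation_limit(substitutions, additions, deletions,
                           variant_length, max_fraction=0.35):
    """True if the variation ratio does not exceed max_fraction.

    The default of 0.35 corresponds to the broadest figure in the text
    (65% sequence identity); narrower embodiments would pass 0.30, 0.25, etc.
    """
    ratio = fraction_varied(substitutions, additions, deletions, variant_length)
    return ratio <= max_fraction


if __name__ == "__main__":
    # 20 substitutions, 5 additions and 10 deletions in a 100-residue variant.
    print(fraction_varied(20, 5, 10, 100))
    print(within_variation_limit(20, 5, 10, 100))
    print(within_variation_limit(20, 5, 10, 100, max_fraction=0.20))
```

Because the text says "about" 35%, a strict numerical cutoff such as the one above is only one possible reading.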

The term "totipotent" refers to the capability of a cell to differentiate into all of
the cell types of an adult organism.

The term "transformation" means introducing DNA into a suitable host cell so
that the DNA is replicable, either as an extrachromosomal element, or by chromosomal
integration. The term "transfection" refers to the taking up of an expression vector by a
suitable host cell, whether or not any coding sequences are in fact expressed. The term
"infection" refers to the introduction of nucleic acids into a suitable host cell by use of a
virus or viral vector.

As used herein, an "uptake modulating fragment," UMF, means a series of
nucleotides which mediate the uptake of a linked DNA fragment into a cell. UMFs can
be readily identified using known UMFs as a target sequence or target motif with the
computer-based systems described below. The presence and activity of a UMF can be
confirmed by attaching the suspected UMF to a marker sequence. The resulting nucleic
acid molecule is then incubated with an appropriate host under appropriate conditions and
the uptake of the marker sequence is determined. As described above, a UMF will
increase the frequency of uptake of a linked marker sequence.
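As a minimal sketch of using a known UMF as a target motif, the following scans a nucleotide sequence for exact occurrences of a motif. The motif and test sequence in the usage lines are hypothetical placeholders, not sequences disclosed in this specification, and exact string matching is an assumption of the example; the computer-based systems referred to in the text may use other matching criteria.

```python
def find_motif(sequence, motif):
    """Return 0-based start positions of every exact match, overlaps included."""
    if not motif:
        raise ValueError("motif must not be empty")
    sequence = sequence.upper()
    motif = motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Advance by one base so that overlapping matches are also reported.
        start = sequence.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    # Hypothetical motif and sequence, for illustration only.
    print(find_motif("ttAAGTGCGGTccAAGTGCGGT", "AAGTGCGGT"))
```

A computational hit of this kind only nominates a candidate; the marker-uptake assay described above remains the confirmation step.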

Each of the above terms is meant to encompass all that is described for each,
unless the context dictates otherwise.

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence listing. 

The isolated polynucleotides of the invention include a polynucleotide comprising
the nucleotide sequences of SEQ ID NO: 1 - 438; a polynucleotide encoding any one of
the peptide sequences of SEQ ID NO: 1 - 438; and a polynucleotide comprising the
nucleotide sequence encoding the mature protein coding sequence of the polynucleotides
of any one of SEQ ID NO: 1 - 438. The polynucleotides of the present invention also
include, but are not limited to, a polynucleotide that hybridizes under stringent conditions
to (a) the complement of any of the nucleotide sequences of SEQ ID NO: 1 - 438; (b)
nucleotide sequences encoding any one of the amino acid sequences set forth in the
Sequence listing; (c) a polynucleotide which is an allelic variant of any polynucleotide
recited above; (d) a polynucleotide which encodes a species homolog of any of the
proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a
specific domain or truncation of the polypeptides of SEQ ID NO: 1 - 438. Domains of
interest may depend on the nature of the encoded polypeptide; e.g., domains in
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or
cytoplasmic domains, or combinations thereof; domains in immunoglobulin-like proteins
include the variable immunoglobulin-like domains; domains in enzyme-like polypeptides
include catalytic and substrate binding domains; and domains in ligand polypeptides
include receptor-binding domains.

The polynucleotides of the invention include naturally occurring or wholly or
partially synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The
polynucleotides may include all of the coding region of the cDNA or may represent a
portion of the coding region of the cDNA.

The present invention also provides genes corresponding to the cDNA sequences
disclosed herein. The corresponding genes can be isolated in accordance with known
methods using the sequence information disclosed herein. Such methods include the
preparation of probes or primers from the disclosed sequence information for identification
and/or amplification of genes in appropriate genomic libraries or other sources of genomic
materials. Further 5' and 3' sequence can be obtained using methods known in the art. For
example, full length cDNA or genomic DNA that corresponds to any of the polynucleotides
of SEQ ID NO: 1 - 438 can be obtained by screening appropriate cDNA or genomic DNA
libraries under suitable hybridization conditions using any of the polynucleotides of SEQ ID
NO: 1 - 438 or a portion thereof as a probe. Alternatively, the polynucleotides of SEQ ID
NO: 1 - 438 may be used as the basis for suitable primer(s) that allow identification and/or
amplification of genes in appropriate genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and
sequences (including cDNA and genomic sequences) obtained from one or more public
databases, such as dbEST, gbpri, and UniGene. The EST sequences can provide identifying
sequence information, representative fragment or segment information, or novel segment
information for the full-length gene.

The polynucleotides of the invention also provide polynucleotides including
nucleotide sequences that are substantially equivalent to the polynucleotides recited
above. Polynucleotides according to the invention can have, e.g., at least about 65%, at
least about 70%, at least about 75%, at least about 80%, 81%, 82%, 83%, 84%, more
typically at least about 85%, 86%, 87%, 88%, 89%, more typically at least about 90%,
91%, 92%, 93%, 94%, and even more typically at least about 95%, 96%, 97%, 98%, 99%
sequence identity to a polynucleotide recited above.

Included within the scope of the nucleic acid sequences of the invention are
nucleic acid sequence fragments that hybridize under stringent conditions to any of the
nucleotide sequences of SEQ ID NO: 1 - 438, or complements thereof, which fragment is
greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9
nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or
20 nucleotides or more that are selective for (i.e., specifically hybridize to) any one of the
polynucleotides of the invention are contemplated. Probes capable of specifically
hybridizing to a polynucleotide can differentiate polynucleotide sequences of the
invention from other polynucleotide sequences in the same family of genes or can
differentiate human genes from genes of other species, and are preferably based on
unique nucleotide sequences.

The sequences falling within the scope of the present invention are not limited to
these specific sequences, but also include allelic and species variations thereof. Allelic and
species variations can be routinely determined by comparing the sequence provided in SEQ
ID NO: 1 - 438, a representative fragment thereof, or a nucleotide sequence at least 90%
identical, preferably 95% identical, to SEQ ID NOs: 1 - 438 with a sequence from another
isolate of the same species. Furthermore, to accommodate codon variability, the invention
includes nucleic acid molecules coding for the same amino acid sequences as do the specific
ORFs disclosed herein. In other words, in the coding region of an ORF, substitution of one
codon for another codon that encodes the same amino acid is expressly contemplated.
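The codon substitution contemplated above can be illustrated with a short sketch that builds the standard genetic code, lists the codons synonymous with a given codon, and checks that two different coding sequences translate to the same amino acid sequence. The function names and example sequences are hypothetical; only the standard genetic code is assumed, and organism-specific code variants are not handled.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
                "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence read in frame from its first base."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(codon):
    """Return all codons, sorted, that encode the same amino acid as `codon`."""
    amino_acid = CODON_TABLE[codon.upper()]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


if __name__ == "__main__":
    print(synonymous_codons("GCT"))
    # Two hypothetical ORFs differing only by a synonymous alanine codon.
    print(translate("ATGGCTTAA") == translate("ATGGCCTAA"))
```

In this example the two sequences differ at the nucleotide level yet encode the same polypeptide, which is the situation the codon-variability language is directed to.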

The nearest neighbor or homology result for the nucleic acids of the present
invention, including SEQ ID NOs: 1 - 438, can be obtained by searching a database using an
algorithm or a program. Preferably, BLAST, which stands for Basic Local Alignment
Search Tool, is used to search for local sequence alignments (Altschul, S.F., J. Mol. Evol.
36:290-300 (1993) and Altschul, S.F. et al., J. Mol. Biol. 215:403-410 (1990)). Alternatively,
a FASTA version 3 search against Genpept, using the Fastxy algorithm, may be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are
also provided by the present invention. Species homologs may be isolated and identified
by making suitable probes or primers from the sequences provided herein and screening a
suitable nucleic acid source from the desired species.

The invention also encompasses allelic variants of the disclosed polynucleotides
or proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide
which also encode proteins which are identical, homologous or related to that encoded by
the polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences
which encode variants of the described nucleic acids. These amino acid sequence
variants may be prepared by methods known in the art by introducing appropriate
nucleotide changes into a native or variant polynucleotide. There are two variables in the
construction of amino acid sequence variants: the location of the mutation and the nature
of the mutation. Nucleic acids encoding the amino acid sequence variants are preferably
constructed by mutating the polynucleotide to encode an amino acid sequence that does
not occur in nature. These nucleic acid alterations can be made at sites that differ in the
nucleic acids from different species (variable positions) or in highly conserved regions
(constant regions). Sites at such locations will typically be modified in series, e.g., by
substituting first with conservative choices (e.g., hydrophobic amino acid to a different
hydrophobic amino acid) and then with more distant choices (e.g., hydrophobic amino
acid to a charged amino acid), and then deletions or insertions may be made at the target
site. Amino acid sequence deletions generally range from about 1 to 30 residues,
preferably about 1 to 10 residues, and are typically contiguous. Amino acid insertions
include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino
acid residues. Intrasequence insertions may range generally from about 1 to 10 amino
acid residues, preferably from 1 to 5 residues. Examples of terminal insertions include the
heterologous signal sequences necessary for secretion or for intracellular targeting in
different host cells and sequences such as FLAG or poly-histidine sequences useful for
purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences
are changed via site-directed mutagenesis. This method uses oligonucleotide sequences
to alter a polynucleotide to encode the desired amino acid variant, as well as sufficient
adjacent nucleotides on both sides of the changed amino acid to form a stable duplex on
either side of the site being changed. In general, the techniques of site-directed
mutagenesis are well known to those of skill in the art and this technique is exemplified
by publications such as Edelman et al., DNA 2:183 (1983). A versatile and efficient
method for producing site-specific changes in a polynucleotide sequence was published
by Zoller and Smith, Nucleic Acids Res. 10:6487-6500 (1982). PCR may also be used to
create amino acid sequence variants of the novel nucleic acids. When small amounts of
template DNA are used as starting material, primers that differ slightly in sequence
from the corresponding region in the template DNA can generate the desired amino acid
variant. PCR amplification results in a population of product DNA fragments that differ
from the polynucleotide template encoding the polypeptide at the position specified by
the primer. The product DNA fragments replace the corresponding region in the plasmid
and this gives a polynucleotide encoding the desired amino acid variant.
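The oligonucleotide used in site-directed mutagenesis, as described above, carries the altered codon together with adjacent template nucleotides on both sides. A minimal sketch of assembling such a sequence follows. The function name, the 12-nucleotide default flank, and the example template are assumptions for illustration; the text does not specify a flank length, and a practical design would also consider duplex stability and strand orientation.

```python
def mutagenic_oligo(template, codon_index, new_codon, flank=12):
    """Build an oligo spanning one changed codon plus flanking template bases.

    template    -- coding-strand DNA, read in frame from position 0
    codon_index -- 0-based index of the codon to replace
    new_codon   -- three-base replacement codon
    flank       -- template bases kept on each side (assumed default)
    """
    template = template.upper()
    new_codon = new_codon.upper()
    start = codon_index * 3
    end = start + 3
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three bases")
    if codon_index < 0 or end > len(template):
        raise ValueError("codon_index is outside the template")
    left = template[max(0, start - flank):start]   # upstream flank
    right = template[end:end + flank]              # downstream flank
    return left + new_codon + right


if __name__ == "__main__":
    # Hypothetical ORF: ATG GCT GAA AAA TGT GGT CAT (M A E K C G H).
    # Replace the cysteine codon TGT (index 4) with the serine codon TCT.
    print(mutagenic_oligo("ATGGCTGAAAAATGTGGTCAT", 4, "TCT", flank=6))
```

The returned sequence is written in the sense of the coding strand; whether the synthesized oligonucleotide should be this sequence or its reverse complement depends on which template strand is used in the protocol.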

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis
techniques well known in the art, such as, for example, the techniques in Sambrook et al.,
supra, and Current Protocols in Molecular Biology, Ausubel et al. Due to the inherent
degeneracy of the genetic code, other DNA sequences which encode substantially the
same or a functionally equivalent amino acid sequence may be used in the practice of the
invention for the cloning and expression of these novel nucleic acids. Such DNA
sequences include those which are capable of hybridizing to the appropriate novel nucleic
acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can 
be used to generate polynucleotides encoding chimeric or fusion proteins comprising one 
or more domains of the invention and heterologous protein sequences. 

The polynucleotides of the invention additionally include the complement of any
of the polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA,
amplified, or synthetic) or RNA. Methods and algorithms for obtaining such
polynucleotides are well known to those of skill in the art and can include, for example,
methods for determining hybridization conditions that can routinely isolate
polynucleotides of the desired sequence identities.

In accordance with the invention, polynucleotide sequences comprising the
mature protein coding sequences corresponding to any one of SEQ ID NO: 1 - 438, or
functional equivalents thereof, may be used to generate recombinant DNA molecules that
direct the expression of that nucleic acid, or a functional equivalent thereof, in
appropriate host cells. Also included are the cDNA inserts of any of the clones identified
herein.

A polynucleotide according to the invention can be joined to any of a variety of
other nucleotide sequences by well-established recombinant DNA techniques (see
Sambrook J et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY). Useful nucleotide sequences for joining to polynucleotides include an
assortment of vectors, e.g., plasmids, cosmids, lambda phage derivatives, phagemids, and
the like, that are well known in the art. Accordingly, the invention also provides a vector
including a polynucleotide of the invention and a host cell containing the polynucleotide.
In general, the vector contains an origin of replication functional in at least one organism,
convenient restriction endonuclease sites, and a selectable marker for the host cell.
Vectors according to the invention include expression vectors, replication vectors, probe
generation vectors, and sequencing vectors. A host cell according to the invention can be
a prokaryotic or eukaryotic cell and can be a unicellular organism or part of a
multicellular organism.

The present invention further provides recombinant constructs comprising a
nucleic acid having any of the nucleotide sequences of SEQ ID NOs: 1 - 438 or a
fragment thereof or any other polynucleotides of the invention. In one embodiment, the
recombinant constructs of the present invention comprise a vector, such as a plasmid or
viral vector, into which a nucleic acid having any of the nucleotide sequences of SEQ ID
NOs: 1 - 438 or a fragment thereof is inserted, in a forward or reverse orientation. In the
case of a vector comprising one of the ORFs of the present invention, the vector may
further comprise regulatory sequences, including for example, a promoter, operably
linked to the ORF. Large numbers of suitable vectors and promoters are known to those
of skill in the art and are commercially available for generating the recombinant
constructs of the present invention. The following vectors are provided by way of
example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK, pBs KS, pNH8a,
pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540,
pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, PXTI, pSG (Stratagene);
pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an
expression control sequence such as the pMT2 or pED expression vectors disclosed in
Kaufman et al., Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein
recombinantly. Many suitable expression control sequences are known in the art.
General methods of expressing recombinant proteins are also known and are exemplified
in R. Kaufman, Methods in Enzymology 185, 537-566 (1990). As defined herein
"operably linked" means that the isolated polynucleotide of the invention and an
expression control sequence are situated within a vector or cell in such a way that the
protein is expressed by a host cell which has been transformed (transfected) with the
ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT
(chloramphenicol transferase) vectors or other vectors with selectable markers. Two
appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters
include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV
immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and
mouse metallothionein-I. Selection of the appropriate vector and promoter is well within
the level of ordinary skill in the art. Generally, recombinant expression vectors will
include origins of replication and selectable markers permitting transformation of the host
cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a
promoter derived from a highly-expressed gene to direct transcription of a downstream
structural sequence. Such promoters can be derived from operons encoding glycolytic
enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat
shock proteins, among others. The heterologous structural sequence is assembled in
appropriate phase with translation initiation and termination sequences, and preferably, a
leader sequence capable of directing secretion of translated protein into the periplasmic
space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant
product. Useful expression vectors for bacterial use are constructed by inserting a
structural DNA sequence encoding a desired protein together with suitable translation
initiation and termination signals in operable reading phase with a functional promoter.
The vector will comprise one or more phenotypic selectable markers and an origin of
replication to ensure maintenance of the vector and to, if desirable, provide amplification
within the host. Suitable prokaryotic hosts for transformation include E. coli, Bacillus
subtilis, Salmonella typhimurium and various species within the genera Pseudomonas,
Streptomyces, and Staphylococcus, although others may also be employed as a matter of
choice.

As a representative but non-limiting example, useful expression vectors for
bacterial use can comprise a selectable marker and bacterial origin of replication derived
from commercially available plasmids comprising genetic elements of the well known
cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example,
pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM 1 (Promega
Biotech, Madison, WI, USA). These pBR322 "backbone" sections are combined with an
appropriate promoter and the structural sequence to be expressed. Following
transformation of a suitable host strain and growth of the host strain to an appropriate cell
density, the selected promoter is induced or derepressed by appropriate means (e.g.,
temperature shift or chemical induction) and cells are cultured for an additional period.
Cells are typically harvested by centrifugation, disrupted by physical or chemical means,
and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses.
For example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated
herein by reference, nucleic acid sequences encoding a polypeptide may be used to
generate antibodies against the encoded polypeptide following topical administration of
naked plasmid DNA or following injection, and preferably intra-muscular injection of the
DNA. The nucleic acid sequences are preferably inserted in a recombinant expression
vector and may be in the form of naked DNA.

4.3 ANTISENSE

Another aspect of the invention pertains to isolated antisense nucleic acid
molecules that are hybridizable to or complementary to the nucleic acid molecule
comprising the nucleotide sequence of SEQ ID NO: 1 - 438, or fragments, analogs or
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is
complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the
coding strand of a double-stranded cDNA molecule or complementary to an mRNA
sequence. In specific aspects, antisense nucleic acid molecules are provided that
comprise a sequence complementary to at least about 10, 25, 50, 100, 250 or 500
nucleotides or an entire coding strand, or to only a portion thereof. Nucleic acid
molecules encoding fragments, homologs, derivatives and analogs of a protein of any of
SEQ ID NO: 1 - 438 or antisense nucleic acids complementary to a nucleic acid sequence
of SEQ ID NO: 1 - 438 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding
region" of the coding strand of a nucleotide sequence of the invention. The term "coding
region" refers to the region of the nucleotide sequence comprising codons which are
translated into amino acid residues. In another embodiment, the antisense nucleic acid
molecule is antisense to a "noncoding region" of the coding strand of a nucleotide
sequence of the invention. The term "noncoding region" refers to 5' and 3' sequences that
flank the coding region that are not translated into amino acids (i.e., also referred to as 5'
and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., 

15 SEQ ID NO: 1 - 438, antisense nucleic acids of the invention can be designed according 
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid 
molecule can be complementary to the entire coding region of an mRNA, but more 
preferably is an oligonucleotide that is antisense to only a portion of the coding or 
noncoding region of an mRNA. For example, the antisense oligonucleotide can be 

20 complementary to the region surrounding the translation start site of an mRNA. An 
antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 
50 nucleotides in length. An antisense nucleic acid of the invention can be constructed 
using chemical synthesis or enzymatic ligation reactions using procedures known in the 
art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be 

25 chemically synthesized using naturally occurring nucleotides or variously modified 
nucleotides designed to increase the biological stability of the molecules or to increase 
the physical stability of the duplex formed between the antisense and sense nucleic acids, 
e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. 
Examples of modified nucleotides that can be used to generate the antisense
nucleic acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil,
hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil,
dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine,
1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine,
2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine,
pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil,
5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v),
5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and
2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically
using an expression vector into which a nucleic acid has been subcloned in an antisense
orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense
orientation to a target nucleic acid of interest, described further in the following
subsection).
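As a minimal sketch of the Watson and Crick design rule described above, the following Python fragment derives an antisense DNA oligonucleotide complementary to the region surrounding the translation start site. It assumes that the first ATG of the coding strand is the start site and considers Watson-Crick pairing only. The function name `antisense_oligo`, the `upstream` parameter and the example sequence are hypothetical.

```python
# Watson-Crick complement for DNA letters; U (RNA input) pairs with A.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_oligo(coding_strand, length=20, upstream=10):
    """Return a DNA oligonucleotide antisense to the region around the start codon.

    Assumptions: the first ATG (or AUG) marks the translation start site, and
    the target window begins `upstream` nucleotides 5' of it.
    """
    seq = coding_strand.upper().replace("U", "T")
    start = seq.find("ATG")
    if start == -1:
        raise ValueError("no start codon found")
    begin = max(0, start - upstream)
    target = seq[begin:begin + length]
    # Antisense = reverse complement of the targeted sense sequence.
    return target.translate(COMPLEMENT)[::-1]


oligo = antisense_oligo("GGCACCATGGCTTGGAAATAACC", length=10, upstream=3)
```

Here the targeted sense window is ACCATGGCTT, so the resulting oligonucleotide, written 5' to 3', is AAGCCATGGT. The `length` argument can be set to any of the approximate lengths recited above.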

The antisense nucleic acid molecules of the invention are typically administered
to a subject or generated in situ such that they hybridize with or bind to cellular mRNA
and/or genomic DNA encoding a protein according to the invention to thereby inhibit
expression of the protein, e.g., by inhibiting transcription and/or translation. The
hybridization can be by conventional nucleotide complementarity to form a stable duplex,
or, for example, in the case of an antisense nucleic acid molecule that binds to DNA
duplexes, through specific interactions in the major groove of the double helix. An
example of a route of administration of antisense nucleic acid molecules of the invention
includes direct injection at a tissue site. Alternatively, antisense nucleic acid molecules
can be modified to target selected cells and then administered systemically. For example,
for systemic administration, antisense molecules can be modified such that they
specifically bind to receptors or antigens expressed on a selected cell surface, by
linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell
surface receptors or antigens. The antisense nucleic acid molecules can also be delivered
to cells using the vectors described herein. To achieve sufficient intracellular
concentrations of antisense molecules, vector constructs in which the antisense nucleic
acid molecule is placed under the control of a strong pol II or pol III promoter are
preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is
an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms
specific double-stranded hybrids with complementary RNA in which, contrary to the usual
β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15:
6625-6641). The antisense nucleic acid molecule can also comprise a
2'-o-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res 15: 6131-6148) or a
chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett 215: 327-330).


4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a
ribozyme. Ribozymes are catalytic RNA molecules with ribonuclease activity that are
capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which they have
a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes (described in
Haseloff and Gerlach (1988) Nature 334:585-591)) can be used to catalytically cleave
mRNA transcripts to thereby inhibit translation of an mRNA. A ribozyme having
specificity for a nucleic acid of the invention can be designed based upon the nucleotide
sequence of a DNA disclosed herein (i.e., SEQ ID NO: 1-438). For example, a
derivative of Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide
sequence of the active site is complementary to the nucleotide sequence to be cleaved in a
SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al.
U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be used to select a catalytic
RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g.,
Bartel et al. (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple
helical structures that prevent transcription of the gene in target cells. See generally,
Helene (1991) Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad.
Sci. 660:27-36; and Maher (1992) Bioessays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the
base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability,
hybridization, or solubility of the molecule. For example, the deoxyribose phosphate
backbone of the nucleic acids can be modified to generate peptide nucleic acids (see
Hyrup et al. (1996) Bioorg Med Chem 4: 5-23). As used herein, the terms "peptide
nucleic acids" or "PNAs" refer to nucleic acid mimics, e.g., DNA mimics, in which the
deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the
four natural nucleobases are retained. The neutral backbone of PNAs has been shown to
allow for specific hybridization to DNA and RNA under conditions of low ionic strength.
The synthesis of PNA oligomers can be performed using standard solid phase peptide
synthesis protocols as described in Hyrup et al. (1996) above and Perry-O'Keefe et al. (1996)
PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific
modulation of gene expression by, e.g., inducing transcription or translation arrest or
inhibiting replication. PNAs of the invention can also be used, e.g., in the analysis of
single base pair mutations in a gene by, e.g., PNA directed PCR clamping; as artificial
restriction enzymes when used in combination with other enzymes, e.g., S1 nucleases
(Hyrup (1996) above); or as probes or primers for DNA sequence and hybridization
(Hyrup et al. (1996), above; Perry-O'Keefe (1996), above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance
their stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by
the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of
drug delivery known in the art. For example, PNA-DNA chimeras can be generated that
may combine the advantageous properties of PNA and DNA. Such chimeras allow DNA
recognition enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA
portion while the PNA portion would provide high binding affinity and specificity.
PNA-DNA chimeras can be linked using linkers of appropriate lengths selected in terms
of base stacking, number of bonds between the nucleobases, and orientation (Hyrup
(1996) above). The synthesis of PNA-DNA chimeras can be performed as described in
Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24: 3357-63. For example, a
DNA chain can be synthesized on a solid support using standard phosphoramidite
coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the
PNA and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA
monomers are then coupled in a stepwise manner to produce a chimeric molecule with a
5' PNA segment and a 3' DNA segment (Finn et al. (1996) above). Alternatively,
chimeric molecules can be synthesized with a 5' DNA segment and a 3' PNA segment.
See, Petersen et al. (1975) Bioorg Med Chem Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups
such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating
transport across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci.
U.S.A. 86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT
Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No.
WO89/10134). In addition, oligonucleotides can be modified with hybridization triggered
cleavage agents (see, e.g., Krol et al., 1988, BioTechniques 6:958-976) or intercalating
agents (see, e.g., Zon, 1988, Pharm. Res. 5: 539-549). To this end, the oligonucleotide
may be conjugated to another molecule, e.g., a peptide, a hybridization triggered
cross-linking agent, a transport agent, a hybridization-triggered cleavage agent, etc.

4.5 HOSTS

The present invention further provides host cells genetically engineered to contain 
the polynucleotides of the invention. For example, such host cells may contain nucleic 
acids of the invention introduced into the host cell using known transformation, 
transfection or infection methods. The present invention still further provides host cells
genetically engineered to express the polynucleotides of the invention, wherein such
polynucleotides are in operative association with a regulatory sequence heterologous to 
the host cell which drives expression of the polynucleotides in the cell. 

Knowledge of nucleic acid sequences allows for modification of cells to permit,
or increase, expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in
whole or in part, the naturally occurring promoter with all or part of a heterologous
promoter so that the cells express the polypeptide at higher levels. The heterologous
promoter is inserted in such a manner that it is operatively linked to the encoding
sequences. See, for example, PCT International Publication No. WO94/12650, PCT
International Publication No. WO92/20808, and PCT International Publication No.
WO91/09955. It is also contemplated that, in addition to heterologous promoter DNA,
amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase)
and/or intron DNA may be inserted along with the heterologous promoter DNA. If
linked to the coding sequence, amplification of the marker DNA by standard selection
methods results in co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a
lower eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell,
such as a bacterial cell. Introduction of the recombinant construct into the host cell can
be effected by calcium phosphate transfection, DEAE-dextran mediated transfection, or
electroporation (Davis, L. et al., Basic Methods in Molecular Biology (1986)). The host
cells containing one of the polynucleotides of the invention can be used in conventional
manners to produce the gene product encoded by the isolated fragment (in the case of an
ORF) or can be used to produce a heterologous protein under the control of the EMR.

Any host/vector system can be used to express one or more of the ORFs of the
present invention. These include, but are not limited to, eukaryotic hosts such as HeLa
cells, CV-1 cells, COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E.
coli and B. subtilis. The most preferred cells are those which do not normally express the
particular polypeptide or protein or which express the polypeptide or protein at a low
natural level. Mature proteins can be expressed in mammalian cells, yeast, bacteria, or
other cells under the control of appropriate promoters. Cell-free translation systems can
also be employed to produce such proteins using RNAs derived from the DNA constructs
of the present invention. Appropriate cloning and expression vectors for use with
prokaryotic and eukaryotic hosts are described by Sambrook, et al., in Molecular
Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New York (1989),
the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express
recombinant protein. Examples of mammalian expression systems include the COS-7
lines of monkey kidney fibroblasts, described by Gluzman, Cell 23:175 (1981). Other
cell lines capable of expressing a compatible vector are, for example, the C127, monkey
COS cells, Chinese Hamster Ovary (CHO) cells, human kidney 293 cells, human
epidermal A431 cells, human Colo205 cells, 3T3 cells, CV-1 cells, other transformed
primate cell lines, normal diploid cells, cell strains derived from in vitro culture of
primary tissue, primary explants, HeLa cells, mouse L cells, BHK, HL-60, U937, HaK or
Jurkat cells. Mammalian expression vectors will comprise an origin of replication, a
suitable promoter and also any necessary ribosome binding sites, polyadenylation site,
splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for
example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be
used to provide the required nontranscribed genetic elements. Recombinant polypeptides
and proteins produced in bacterial culture are usually isolated by initial extraction from
cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion
chromatography steps. Protein refolding steps can be used, as necessary, in completing
configuration of the mature protein. Finally, high performance liquid chromatography
(HPLC) can be employed for final purification steps. Microbial cells employed in
expression of proteins can be disrupted by any convenient method, including freeze-thaw
cycling, sonication, mechanical disruption, or use of cell lysing agents.

Alternatively, it may be possible to produce the protein in lower eukaryotes such 
as yeast or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains 
include Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains,
Candida, or any yeast strain capable of expressing heterologous proteins. Potentially
suitable bacterial strains include Escherichia coli, Bacillus subtilis, Salmonella 
typhimurium, or any bacterial strain capable of expressing heterologous proteins. If the 
protein is made in yeast or bacteria, it may be necessary to modify the protein produced 
therein, for example by phosphorylation or glycosylation of the appropriate sites, in order
to obtain the functional protein. Such covalent attachments may be accomplished using
known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be
engineered to express an endogenous gene comprising the polynucleotides of the
invention under the control of inducible regulatory elements, in which case the regulatory
sequences of the endogenous gene may be replaced by homologous recombination. As
described herein, gene targeting can be used to replace a gene's existing regulatory region
with a regulatory sequence isolated from a different gene or a novel regulatory sequence
synthesized by genetic engineering methods. Such regulatory sequences may be
comprised of promoters, enhancers, scaffold-attachment regions, negative regulatory
elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the
RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements,
splice sites, leader sequences for enhancing or modifying transport or secretion properties
of the protein, or other sequences which alter or improve the function or stability of
protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing 
the gene under the control of the new regulatory sequence, e.g., inserting a new promoter 
or enhancer or both upstream of a gene. Alternatively, the targeting event may be a 
simple deletion of a regulatory element, such as the deletion of a tissue-specific negative
regulatory element. Alternatively, the targeting event may replace an existing element;
for example, a tissue-specific enhancer can be replaced by an enhancer that has broader 
or different cell-type specificity than the naturally occurring elements. Here, the 
naturally occurring sequences are deleted and new sequences are added. In all cases, the 
identification of the targeting event may be facilitated by the use of one or more
selectable marker genes that are contiguous with the targeting DNA, allowing for the
selection of cells in which the exogenous DNA has integrated into the host cell genome. 
The identification of the targeting event may also be facilitated by the use of one or more 
marker genes exhibiting the property of negative selection, such that the negatively 
selectable marker is linked to the exogenous DNA, but configured such that the
negatively selectable marker flanks the targeting sequence, and such that a correct
homologous recombination event with sequences in the host cell genome does not result
in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial 
xanthine-guanine phosphoribosyl-transferase (gpt) gene. 

The gene targeting or gene activation techniques which can be used in accordance
with this aspect of the invention are more particularly described in U.S. Patent No.
5,272,071 to Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International
Application No. PCT/US92/09627 (WO93/09222) by Selden et al.; and International
Application No. PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is
incorporated by reference herein in its entirety.


4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a
polypeptide comprising: the amino acid sequences set forth as any one of SEQ ID NO:
1-438 or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID
NOs: 1-438 or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides preferably with biological or immunological activity
that are encoded by: (a) a polynucleotide having any one of the nucleotide sequences set
forth in SEQ ID NOs: 1-438 or (b) polynucleotides encoding any one of the amino acid
sequences set forth as SEQ ID NO: 1-438 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization
conditions. The invention also provides biologically active or immunologically active
variants of any of the amino acid sequences set forth as SEQ ID NO: 1-438 or the
corresponding full length or mature protein; and "substantial equivalents" thereof (e.g.,
with at least about 65%, at least about 70%, at least about 75%, at least about 80%, at
least about 85%, 86%, 87%, 88%, 89%, at least about 90%, 91%, 92%, 93%, 94%,
typically at least about 95%, 96%, 97%, more typically at least about 98%, or most
typically at least about 99% amino acid identity) that retain biological activity.
Polypeptides encoded by allelic variants may have a similar, increased, or decreased
activity compared to polypeptides comprising SEQ ID NO: 1-438.
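As an illustrative sketch only, the amino acid identity thresholds recited above could be checked for two sequences that have already been aligned. The sketch assumes one common convention, namely identical residues divided by the number of aligned columns, with "-" denoting a gap; other conventions exist and give different values. The function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length ('-' = gap).

    Convention assumed here: identical residues / aligned columns, where
    columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("empty alignment")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a, aligned_b, threshold=65.0):
    """True if the pair reaches the given percent identity (e.g., 65, 90 or 99)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


identity = percent_identity("MKTAYIAKQR", "MKSAYI-KQR")
```

In this example eight of ten aligned columns are identical, so the pair would meet a threshold of at least about 80% identity but not one of at least about 85%.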

Fragments of the proteins of the present invention which are capable of exhibiting
biological activity are also encompassed by the present invention. Fragments of the
protein may be in linear form or they may be cyclized using known methods, for
example, as described in H. U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and
in R. S. McDowell, et al., J. Amer. Chem. Soc. 114, 9245-9253 (1992), both of which are
incorporated herein by reference. Such fragments may be fused to carrier molecules such
as immunoglobulins for many purposes, including increasing the valency of protein
binding sites.

The present invention also provides both full-length and mature forms (for
example, without a signal sequence or precursor sequence) of the disclosed proteins. The
protein coding sequence is identified in the sequence listing by translation of the
disclosed nucleotide sequences. The mature form of such protein may be obtained by
expression of a full-length polynucleotide in a suitable mammalian cell or other host cell.
The sequence of the mature form of the protein is also determinable from the amino acid
sequence of the full-length form. Where proteins of the present invention are membrane
bound, soluble forms of the proteins are also provided. In such forms, part or all of the
regions causing the proteins to be membrane bound are deleted so that the proteins are
fully secreted from the cell in which they are expressed.

Protein compositions of the present invention may further comprise an acceptable 
carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the
nucleic acid fragments of the present invention or by degenerate variants of the nucleic
acid fragments of the present invention. By "degenerate variant" is intended nucleotide
fragments which differ from a nucleic acid fragment of the present invention (e.g., an
ORF) by nucleotide sequence but, due to the degeneracy of the genetic code, encode an
identical polypeptide sequence. Preferred nucleic acid fragments of the present invention
are the ORFs that encode proteins.
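The definition of a "degenerate variant" can be illustrated with a short Python sketch that translates two ORFs using the standard genetic code and tests whether they differ in nucleotide sequence yet encode an identical polypeptide. The function names and the example ORFs are hypothetical and are not drawn from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(orf):
    """Translate an ORF (DNA letters, length divisible by 3) into one-letter amino acids."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def is_degenerate_variant(orf_a, orf_b):
    """True if the ORFs differ in nucleotide sequence but encode the same polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


same_protein = is_degenerate_variant("ATGGCTTGGAAATAA", "ATGGCCTGGAAGTGA")
```

Both example ORFs translate to the same polypeptide (M-A-W-K followed by a stop), so the second is a degenerate variant of the first. Changing GCT to GAT would instead alter the encoded amino acid, and the result would not be a degenerate variant.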

A variety of methodologies known in the art can be utilized to obtain any one of
the isolated polypeptides or proteins of the present invention. At the simplest level, the
amino acid sequence can be synthesized using commercially available peptide
synthesizers. The synthetically-constructed protein sequences, by virtue of sharing
primary, secondary or tertiary structural and/or conformational characteristics with
proteins, may possess biological properties in common therewith, including protein
activity. This technique is particularly useful in producing small peptides and fragments
of larger polypeptides. Fragments are useful, for example, in generating antibodies
against the native polypeptide. Thus, they may be employed as biologically active or
immunological substitutes for natural, purified proteins in screening of therapeutic
compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be 
purified from cells which have been altered to express the desired polypeptide or protein. 
As used herein, a cell is said to be altered to express a desired polypeptide or protein 
when the cell, through genetic manipulation, is made to produce a polypeptide or protein
which it normally does not produce or which the cell normally produces at a lower level.
One skilled in the art can readily adapt procedures for introducing and expressing either 
recombinant or synthetic sequences into eukaryotic or prokaryotic cells in order to 
generate a cell which produces one of the polypeptides or proteins of the present 
invention. 

The invention also relates to methods for producing a polypeptide comprising
growing a culture of host cells of the invention in a suitable culture medium, and
purifying the protein from the cells or the culture in which the cells are grown. For
example, the methods of the invention include a process for producing a polypeptide in
which a host cell containing a suitable expression vector that includes a polynucleotide of
the invention is cultured under conditions that allow expression of the encoded
polypeptide. The polypeptide can be recovered from the culture, conveniently from the
culture medium, or from a lysate prepared from the host cells and further purified.
Preferred embodiments include those in which the protein produced by such process is a
full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial
cells which naturally produce the polypeptide or protein. One skilled in the art can
readily follow known methods for isolating polypeptides and proteins in order to obtain
one of the isolated polypeptides or proteins of the present invention. These include, but
are not limited to, immunochromatography, HPLC, size-exclusion chromatography,
ion-exchange chromatography, and immuno-affinity chromatography. See, e.g., Scopes,
Protein Purification: Principles and Practice, Springer-Verlag (1994); Sambrook, et al.,
in Molecular Cloning: A Laboratory Manual; Ausubel et al., Current Protocols in
Molecular Biology. Polypeptide fragments that retain biological/immunological activity
include fragments comprising greater than about 100 amino acids, or greater than about
200 amino acids, and fragments that encode specific protein domains.

The purified polypeptides can be used in in vitro binding assays which are well
known in the art to identify molecules which bind to the polypeptides. These molecules
include but are not limited to, e.g., small molecules, molecules from combinatorial
libraries, antibodies or other proteins. The molecules identified in the binding assay are
then tested for antagonist or agonist activity in in vivo tissue culture or animal models
that are well known in the art. In brief, the molecules are titrated into a plurality of cell
cultures or animals and then tested for either cell/animal death or prolonged survival of
the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the 
peptides may be complexed with toxins, e.g., ricin or cholera, or with other compounds
that are toxic to cells. The toxin-binding molecule complex is then targeted to a tumor or
other cell by the specificity of the binding molecule for SEQ ID NO: 1-438. 

The protein of the invention may also be expressed as a product of transgenic 
animals, e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which 
are characterized by somatic or germ cells containing a nucleotide sequence encoding the
protein.

The proteins provided herein also include proteins characterized by amino acid
sequences similar to those of purified proteins but into which modifications are naturally
provided or deliberately engineered. For example, modifications, in the peptide or DNA
sequence, can be made by those skilled in the art using known techniques. Modifications
of interest in the protein sequences may include the alteration, substitution, replacement,
insertion or deletion of a selected amino acid residue in the coding sequence. For
example, one or more of the cysteine residues may be deleted or replaced with another
amino acid to alter the conformation of the molecule. Techniques for such alteration,
substitution, replacement, insertion or deletion are well known to those skilled in the art
(see, e.g., U.S. Pat. No. 4,518,584). Preferably, such alteration, substitution, replacement,
insertion or deletion retains the desired activity of the protein. Regions of the protein that
are important for the protein function can be determined by various methods known in
the art including the alanine-scanning method which involves systematic substitution of
single or strings of amino acids with alanine, followed by testing the resulting
alanine-containing variant for biological activity. This type of analysis determines the
importance of the substituted amino acid(s) in biological activity. Regions of the protein
that are important for protein function may be determined by the eMATRIX program.
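The substitution step of the alanine-scanning method can be sketched as follows. This minimal Python example enumerates single-residue alanine variants only; substitution of strings of residues and the subsequent activity testing are outside its scope. The function name `alanine_scan` and the example peptide are hypothetical.

```python
def alanine_scan(protein):
    """Return (mutation_name, variant_sequence) pairs for single alanine substitutions.

    Positions that are already alanine are skipped, since substituting them
    would not change the sequence. Mutation names are 1-based, e.g. "K3A".
    """
    protein = protein.upper()
    variants = []
    for index, residue in enumerate(protein):
        if residue == "A":
            continue
        name = f"{residue}{index + 1}A"
        variants.append((name, protein[:index] + "A" + protein[index + 1:]))
    return variants


variants = alanine_scan("MAK")
```

For the three-residue example, the scan yields two variants, M1A (AAK) and K3A (MAA). Each variant would then be tested for biological activity to assess the importance of the substituted residue.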

Other fragments and derivatives of the sequences of proteins which would be 
expected to retain protein activity in whole or in part and are useful for screening or other 
immunological methodologies may also be easily made by those skilled in the art given
the disclosures herein. Such modifications are encompassed by the present invention.

The protein may also be produced by operably linking the isolated polynucleotide 
of the invention to suitable control sequences in one or more insect expression vectors, 
and employing an insect expression system. Materials and methods for 
baculovirus/insect cell expression systems are commercially available in kit form from,
e.g., Invitrogen, San Diego, Calif., U.S.A. (the MaxBac™ kit), and such methods are well
known in the art, as described in Summers and Smith, Texas Agricultural Experiment 
Station Bulletin No. 1555 (1987), incorporated herein by reference. As used herein, an 
insect cell capable of expressing a polynucleotide of the present invention is 
"transformed." 

The protein of the invention may be prepared by culturing transformed host cells
under culture conditions suitable to express the recombinant protein. The resulting
expressed protein may then be purified from such culture (i.e., from culture medium or
cell extracts) using known purification processes, such as gel filtration and ion exchange
chromatography. The purification of the protein may also include an affinity column
containing agents which will bind to the protein; one or more column steps over such
affinity resins as concanavalin A-agarose, heparin-toyopearl™ or Cibacron blue 3GA
Sepharose™; one or more steps involving hydrophobic interaction chromatography using
such resins as phenyl ether, butyl ether, or propyl ether; or immunoaffinity
chromatography.

Alternatively, the protein of the invention may also be expressed in a form which
will facilitate purification. For example, it may be expressed as a fusion protein, such as
those of maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin
(TRX), or as a His tag. Kits for expression and purification of such fusion proteins are
commercially available from New England BioLabs (Beverly, Mass.), Pharmacia
(Piscataway, N.J.) and Invitrogen, respectively. The protein can also be tagged with an
epitope and subsequently purified by using a specific antibody directed to such epitope.
One such epitope ("FLAG®") is commercially available from Kodak (New Haven,
Conn.).

Finally, one or more reverse-phase high performance liquid chromatography
(RP-HPLC) steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant
methyl or other aliphatic groups, can be employed to further purify the protein. Some or
all of the foregoing purification steps, in various combinations, can also be employed to
provide a substantially homogeneous isolated recombinant protein. The protein thus
purified is substantially free of other mammalian proteins and is defined in accordance
with the present invention as an "isolated protein."

The polypeptides of the invention include analogs (variants). This embraces
fragments, as well as peptides in which one or more amino acids have been deleted,
inserted, or substituted. Also, analogs of the polypeptides of the invention embrace
fusions of the polypeptides or modifications of the polypeptides of the invention, wherein
the polypeptide or analog is fused to another moiety or moieties, e.g., a targeting moiety or
another therapeutic agent. Such analogs may exhibit improved properties such as activity
and/or stability. Examples of moieties which may be fused to the polypeptide or an
analog include targeting moieties which provide for the delivery of the
polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells, antibodies to immune
cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well as receptors
and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for
example, immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3
antibodies and steroids. Also, polypeptides may be fused to immune modulators and
other cytokines such as alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE
IDENTITY AND SIMILARITY

Preferred methods to determine identity and/or similarity are designed to give the
largest match between the sequences tested. Methods to determine identity and similarity
are codified in computer programs including, but not limited to, the GCG program package,
including GAP (Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics
Computer Group, University of Wisconsin, Madison, WI), BLASTP, BLASTN,
BLASTX, FASTA (Altschul, S.F. et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST
(Altschul S.F. et al., Nucleic Acids Res. vol. 25, pp. 3389-3402, herein incorporated by
reference), eMatrix software (Wu et al., J. Comp. Biol., Vol. 6, pp. 219-235 (1999),
herein incorporated by reference), eMotif software (Nevill-Manning et al., ISMB-97, Vol.
4, pp. 202-209, herein incorporated by reference), pFam software (Sonnhammer et al.,
Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by reference)
and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp. 105-31
(1982), incorporated herein by reference). The BLAST programs are publicly available
from the National Center for Biotechnology Information (NCBI) and other sources
(BLAST Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S.,
et al., J. Mol. Biol. 215:403-410 (1990)).
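
As an illustration only of what "the largest match between the sequences tested" can
mean in practice, the following minimal Python sketch computes percent identity from a
Needleman-Wunsch global alignment. It is not the GAP or BLAST implementation; the
function names, the scoring values (match 1, mismatch -1, linear gap -2) and the choice
of alignment length as the denominator are all assumptions made for this example.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns the two aligned strings, with '-' marking gap positions.
    The scoring values are illustrative assumptions, not GAP/BLAST defaults.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            row_a.append(a[i - 1])
            row_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            row_a.append(a[i - 1])
            row_b.append("-")
            i -= 1
        else:
            row_a.append("-")
            row_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(row_a)), "".join(reversed(row_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    aligned_a, aligned_b = global_align(a, b)
    if not aligned_a:
        return 0.0
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)
```

With these assumed scores, `percent_identity("ACGT", "AGT")` aligns the sequences as
`ACGT` over `A-GT` and yields 75.0. Other programs may use a different denominator
(for example, the length of the shorter sequence), so reported identities are only
comparable when the method and parameters are stated.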

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a
"chimeric protein" or "fusion protein" comprises a polypeptide of the invention
operatively linked to another polypeptide. Within a fusion protein the polypeptide
according to the invention can correspond to all or a portion of a protein according to the
invention. In one embodiment, a fusion protein comprises at least one biologically active
portion of a protein according to the invention. In another embodiment, a fusion protein
comprises at least two biologically active portions of a protein according to the invention.
Within the fusion protein, the term "operatively linked" is intended to indicate that the
polypeptide according to the invention and the other polypeptide are fused in-frame to
each other. The polypeptide can be fused to the N-terminus or C-terminus, or to the
middle.

For example, in one embodiment a fusion protein comprises a polypeptide
according to the invention operably linked to the extracellular domain of a second
protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the
polypeptide sequences of the invention are fused to the C-terminus of the GST (i.e.,
glutathione S-transferase) sequences.

In another embodiment, the fusion protein is an immunoglobulin fusion protein in
which the polypeptide sequences according to the invention comprise one or more
domains fused to sequences derived from a member of the immunoglobulin protein
family. The immunoglobulin fusion proteins of the invention can be incorporated into
pharmaceutical compositions and administered to a subject to inhibit an interaction
between a ligand and a protein of the invention on the surface of a cell, to thereby
suppress signal transduction in vivo. The immunoglobulin fusion proteins can be used to
affect the bioavailability of a cognate ligand. Inhibition of the ligand/protein interaction
may be useful therapeutically both for the treatment of proliferative and differentiative
disorders, e.g., cancer, and for modulating (e.g., promoting or inhibiting) cell survival.
Moreover, the immunoglobulin fusion proteins of the invention can be used as
immunogens to produce antibodies in a subject, to purify ligands, and in screening assays
to identify molecules that inhibit the interaction of a polypeptide of the invention with a
ligand.

A chimeric or fusion protein of the invention can be produced by standard
recombinant DNA techniques. For example, DNA fragments coding for the different
polypeptide sequences are ligated together in-frame in accordance with conventional
techniques, e.g., by employing blunt-ended or stagger-ended termini for ligation,
restriction enzyme digestion to provide for appropriate termini, filling-in of cohesive ends
as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and
enzymatic ligation. In another embodiment, the fusion gene can be synthesized by
conventional techniques including automated DNA synthesizers. Alternatively, PCR
amplification of gene fragments can be carried out using anchor primers that give rise to
complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John
Wiley & Sons, 1992). Moreover, many expression vectors are commercially available
that already encode a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a
polypeptide of the invention can be cloned into such an expression vector such that the
fusion moiety is linked in-frame to the protein of the invention.
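
The in-frame requirement described above can be checked computationally before any
cloning is attempted. The following minimal Python sketch is an illustration under
stated assumptions: the function name `fuse_in_frame` and the toy sequences are
hypothetical, both inputs are assumed to be complete coding sequences given 5' to 3'
starting at codon position 1, and no linker, restriction site or vector sequence is
modeled.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(upstream_cds, downstream_cds):
    """Join two coding sequences so the second is read in the frame of the first.

    The terminal stop codon of the upstream partner (if present) is removed so
    that translation continues into the downstream partner. Raises ValueError
    if either input is not a whole number of codons or if the fused reading
    frame contains a stop codon before its final codon.
    """
    up = upstream_cds.upper()
    down = downstream_cds.upper()
    if len(up) % 3 != 0 or len(down) % 3 != 0:
        raise ValueError("each coding sequence must be a multiple of 3 nt long")
    if up[-3:] in STOP_CODONS:
        up = up[:-3]
    fused = up + down
    codons = [fused[i:i + 3] for i in range(0, len(fused), 3)]
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("premature stop codon in the fused reading frame")
    return fused


# Toy example (hypothetical sequences, not taken from the invention):
# a 4-codon "tag" followed by a 3-codon "insert".
fusion = fuse_in_frame("ATGTCCCCTTAA", "ATGGCTTGA")
```

Here `fusion` is `ATGTCCCCTATGGCTTGA`, i.e. the tag without its stop codon followed by
the insert. A real construct would also need to account for any linker or multiple
cloning site bases between the fusion moiety and the insert, which shift the frame
unless their length is a multiple of three.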

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of
normal function of the encoded protein. The invention thus provides gene therapy to
restore normal activity of the polypeptides of the invention, or to treat disease states
involving polypeptides of the invention. Delivery of a functional gene encoding
polypeptides of the invention to appropriate cells is effected ex vivo, in situ, or in vivo by
use of vectors, and more particularly viral vectors (e.g., adenovirus, adeno-associated
virus, or a retrovirus), or ex vivo by use of physical DNA transfer methods (e.g.,
liposomes or chemical treatments). See, for example, Anderson, Nature, supplement to
vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of gene therapy technology
see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific American: 68-84
(1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of the
nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient
expression) or artificial chromosomes (stable expression). Cells may also be cultured ex
vivo in the presence of proteins of the present invention in order to proliferate or to
produce a desired effect on or activity in such cells. Treated cells can then be introduced
in vivo for therapeutic purposes. Alternatively, it is contemplated that in other human
disease states, preventing the expression of or inhibiting the activity of polypeptides of
the invention will be useful in treating the disease states. It is contemplated that antisense
therapy or gene therapy could be applied to negatively regulate the expression of
polypeptides of the invention.

Other methods of inhibiting expression of a protein include the introduction of
antisense molecules to the nucleic acids of the present invention, their complements, or their
translated RNA sequences, by methods known in the art. Further, the polypeptides of the
present invention can be inhibited by using targeted deletion methods, or the insertion of a
negative regulatory element such as a silencer, which is tissue specific.
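
An antisense molecule is, at the sequence level, the reverse complement of the region of
the sense strand it is meant to bind. The following minimal Python sketch illustrates
that relationship only; the function name `antisense_oligo` and its `start`/`length`
parameters are hypothetical, and the sketch makes no attempt to choose an effective
target site or to model oligonucleotide chemistry.

```python
# Map each sense-strand base to its Watson-Crick partner in a DNA oligo.
# 'U' is accepted so that an mRNA sequence can be passed in directly.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_oligo(sense, start, length):
    """Return a DNA antisense oligo (5' to 3') against sense[start:start+length].

    `sense` is the sense-strand DNA or mRNA sequence written 5' to 3'.
    Raises ValueError for an out-of-range window or non-nucleotide characters.
    """
    seq = sense.upper()
    if start < 0 or length <= 0 or start + length > len(seq):
        raise ValueError("target window lies outside the sequence")
    target = seq[start:start + length]
    if set(target) - set("ACGTU"):
        raise ValueError("sequence contains characters other than A, C, G, T, U")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return target.translate(_COMPLEMENT)[::-1]
```

For example, `antisense_oligo("AUGGCC", 0, 6)` returns `GGCCAT`, which pairs
antiparallel with the six-base target.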

The present invention still further provides cells genetically engineered in vivo to
express the polynucleotides of the invention, wherein such polynucleotides are in operative
association with a regulatory sequence heterologous to the host cell which drives expression
of the polynucleotides in the cell. These methods can be used to increase or decrease the
expression of the polynucleotides of the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of
cells to permit, increase, or decrease expression of endogenous polypeptide. Cells can be
modified (e.g., by homologous recombination) to provide increased polypeptide expression
by replacing, in whole or in part, the naturally occurring promoter with all or part of a
heterologous promoter so that the cells express the protein at higher levels. The heterologous
promoter is inserted in such a manner that it is operatively linked to the desired protein
encoding sequences. See, for example, PCT International Publication No. WO 94/12650,
PCT International Publication No. WO 92/20808, and PCT International Publication No.
WO 91/09955. It is also contemplated that, in addition to heterologous promoter DNA,
amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which encodes
carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or intron
DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods
results in co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered
to express an endogenous gene comprising the polynucleotides of the invention under the
control of inducible regulatory elements, in which case the regulatory sequences of the
endogenous gene may be replaced by homologous recombination. As described herein,
gene targeting can be used to replace a gene's existing regulatory region with a regulatory
sequence isolated from a different gene or a novel regulatory sequence synthesized by
genetic engineering methods. Such regulatory sequences may be comprised of promoters,
enhancers, scaffold-attachment regions, negative regulatory elements, transcriptional
initiation sites, regulatory protein binding sites or combinations of said sequences.
Alternatively, sequences which affect the structure or stability of the RNA or protein
produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader
sequences for enhancing or modifying transport or secretion properties of the protein, or
other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple
deletion of a regulatory element, such as the deletion of a tissue-specific negative regulatory
element. Alternatively, the targeting event may replace an existing element; for example, a
tissue-specific enhancer can be replaced by an enhancer that has broader or different
cell-type specificity than the naturally occurring elements. Here, the naturally occurring
sequences are deleted and new sequences are added. In all cases, the identification of the
targeting event may be facilitated by the use of one or more selectable marker genes that are
contiguous with the targeting DNA, allowing for the selection of cells in which the
exogenous DNA has integrated into the cell genome. The identification of the targeting
event may also be facilitated by the use of one or more marker genes exhibiting the property
of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting
sequence, and such that a correct homologous recombination event with sequences in the
host cell genome does not result in the stable integration of the negatively selectable marker.
Markers useful for this purpose include the Herpes Simplex Virus thymidine kinase (TK)
gene or the bacterial xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance
with this aspect of the invention are more particularly described in U.S. Patent No.
5,272,071 to Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application
No. PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by
reference herein in its entirety.


4.9 TRANSGENIC ANIMALS

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either over expressed
or inactivated in the germ line of animals using homologous recombination [Capecchi,
Science 244:1288-1292 (1989)]. Animals in which the gene is over expressed, under the
regulatory control of exogenous or endogenous promoter elements, are known as
transgenic animals. Animals in which an endogenous gene has been inactivated by
homologous recombination are referred to as "knockout" animals. Knockout animals,
preferably non-human mammals, can be prepared as described in U.S. Patent No.
5,557,032, incorporated herein by reference. Transgenic animals are useful to determine
the roles polypeptides of the invention play in biological processes, and preferably in
disease states. Transgenic animals are useful as model systems to identify compounds
that modulate lipid metabolism. Transgenic animals, preferably non-human mammals,
are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of
expression of the polypeptides of the invention. Inactivation can be carried out using
homologous recombination methods described above. Activation can be achieved by
supplementing or even replacing the homologous promoter to provide for increased
protein expression. The homologous promoter can be supplemented by insertion of one
or more heterologous enhancer elements known to confer promoter activation in a
particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knock out strategies, of animals that fail to
express polypeptides of the invention or that express a variant polypeptide. Such animals
are useful as models for studying the in vivo activities of polypeptide as well as for
studying modulators of the polypeptides of the invention.

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit
one or more of the uses or biological activities (including those associated with assays
cited herein) identified herein. Uses or activities described for proteins of the present
invention may be provided by administration or use of such proteins or of
polynucleotides encoding such proteins (such as, for example, in gene therapies or
vectors suitable for introduction of DNA). The mechanism underlying the particular
condition or pathology will dictate whether the polypeptides of the invention, the
polynucleotides of the invention or modulators (activators or inhibitors) thereof would be
beneficial to the subject in need of treatment. Thus, "therapeutic compositions of the
invention" include compositions comprising isolated polynucleotides (including
recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and
truncations or domains thereof), or compounds and other substances that modulate the
overall activity of the target gene products, either at the level of target gene/protein
expression or target protein activity. Such modulators include polypeptides, analogs
(variants), including fragments and fusion proteins, antibodies and other binding proteins;
chemical compounds that directly or indirectly activate or inhibit the polypeptides of the
invention (identified, e.g., via drug screening assays as described herein); antisense
polynucleotides and polynucleotides suitable for triple helix formation; and in particular
antibodies or other binding partners that specifically recognize one or more epitopes of
the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular
activation or in one of the other physiological pathways described herein.

4.10.1 RESEARCH USES AND UTILITIES

The polynucleotides provided by the present invention can be used by the
research community for various purposes. The polynucleotides can be used to express
recombinant protein for analysis, characterization or therapeutic use; as markers for
tissues in which the corresponding protein is preferentially expressed (either
constitutively or at a particular stage of tissue differentiation or development or in disease
states); as molecular weight markers on gels; as chromosome markers or tags (when
labeled) to identify chromosomes or to map related gene positions; to compare with
endogenous DNA sequences in patients to identify potential genetic disorders; as probes
to hybridize and thus discover novel, related DNA sequences; as a source of information
to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and
making oligomers for attachment to a "gene chip" or other support, including for
examination of expression patterns; to raise anti-protein antibodies using DNA
immunization techniques; and as an antigen to raise anti-DNA antibodies or elicit another
immune response. Where the polynucleotide encodes a protein which binds or
potentially binds to another protein (such as, for example, in a receptor-ligand
interaction), the polynucleotide can also be used in interaction trap assays (such as, for
example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify
inhibitors of the binding interaction.

The polypeptides provided by the present invention can similarly be used in
assays to determine biological activity, including in a panel of multiple proteins for
high-throughput screening; to raise antibodies or to elicit another immune response; as a
reagent (including the labeled reagent) in assays designed to quantitatively determine
levels of the protein (or its receptor) in biological fluids; as markers for tissues in which
the corresponding polypeptide is preferentially expressed (either constitutively or at a
particular stage of tissue differentiation or development or in a disease state); and, of
course, to isolate correlative receptors or ligands. Proteins involved in these binding
interactions can also be used to screen for peptide or small molecule inhibitors or agonists
of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent 
grade or kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in
the art. References disclosing such methods include without limitation "Molecular
Cloning: A Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press,
Sambrook, J., E. F. Fritsch and T. Maniatis eds., 1989, and "Methods in Enzymology:
Guide to Molecular Cloning Techniques", Academic Press, Berger, S. L. and A. R.
Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as
nutritional sources or supplements. Such uses include without limitation use as a protein or
amino acid supplement, use as a carbon source, use as a nitrogen source and use as a source
of carbohydrate. In such cases the polypeptide or polynucleotide of the invention can be
added to the feed of a particular organism or can be administered as a separate solid or liquid
preparation, such as in the form of powder, pills, solutions, suspensions or capsules. In the
case of microorganisms, the polypeptide or polynucleotide of the invention can be added to
the medium in or on which the microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION
ACTIVITY

A polypeptide of the present invention may exhibit activity relating to cytokine,
cell proliferation (either inducing or inhibiting) or cell differentiation (either inducing or
inhibiting) activity or may induce production of other cytokines in certain cell
populations. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Many protein factors discovered to date, including all known cytokines, have
exhibited activity in one or more factor-dependent cell proliferation assays, and hence the
assays serve as a convenient confirmation of cytokine activity. The activity of therapeutic
compositions of the present invention is evidenced by any one of a number of routine
factor dependent cell proliferation assays for cell lines including, without limitation, 32D,
DA2, DA1G, T10, B9, B9/11, BaF3, MC9/G, M+(preB M+), 2E8, RB5, DA1, 123,
T1165, HT2, CTLL2, TF-1, Mo7e, CMK, HUVEC, and Caco. Therapeutic compositions
of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those
described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500,
1986; Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et al., Cellular
Immunology 133:327-341, 1991; Bertagnolli, et al., J. Immunol. 149:3778-3783, 1992;
Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node
cells or thymocytes include, without limitation, those described in: Polyclonal T cell
stimulation, Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology.
J. E. e.a. Coligan eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and
Measurement of mouse and human interleukin-γ, Schreiber, R. D. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto.
1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic
cells include, without limitation, those described in: Measurement of Human and Murine
Interleukin 2 and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current
Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and
Sons, Toronto. 1991; deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al.,
Nature 336:690-692, 1988; Greenberger et al., Proc. Natl. Acad. Sci. U.S.A.
80:2931-2938, 1983; Measurement of mouse and human interleukin 6-Nordan, R. In
Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.6.1-6.6.5, John Wiley
and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. U.S.A. 83:1857-1861, 1986;
Measurement of human Interleukin 11-Bennett, F., Giannotti, J., Clark, S. C. and Turner,
K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.15.1 John
Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin
9-Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in
Immunology. J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others,
proteins that affect APC-T cell interactions as well as direct T-cell effects by measuring
proliferation and cytokine production) include, without limitation, those described in:
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H.
Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function; Chapter
6, Cytokines and their cellular receptors; Chapter 7, Immunologic studies in Humans);
Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 1980; Weinberger et al.,
Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai
et al., J. Immunol. 140:508-512, 1988.

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor
activity and be involved in the proliferation, differentiation and survival of pluripotent
and totipotent stem cells including primordial germ cells, embryonic stem cells,
hematopoietic stem cells and/or germ line stem cells. Administration of the polypeptide
of the invention to stem cells in vivo or ex vivo is expected to maintain and expand cell
populations in a totipotential or pluripotential state which would be useful for
re-engineering damaged or diseased tissues, transplantation, manufacture of
bio-pharmaceuticals and the development of bio-sensors. The ability to produce large
quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors;
implantation of cells to treat diseases such as Parkinson's, Alzheimer's and other
neurodegenerative diseases; tissues for grafting such as bone marrow, skin, cartilage,
tendons, bone, muscle (including cardiac muscle), blood vessels, cornea, neural cells,
gastrointestinal cells and others; and organs for transplantation such as kidney, liver,
pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or
cytokines may be administered in combination with the polypeptide of the invention to
achieve the desired effect, including any of the growth factors listed herein, other stem
cell maintenance factors, and specifically including stem cell factor (SCF), leukemia
inhibitory factor (LIF), Flt-3 ligand (Flt-3L), any of the interleukins, recombinant soluble
IL-6 receptor fused to IL-6, macrophage inflammatory protein 1-alpha (MIP-1-alpha),
G-CSF, GM-CSF, thrombopoietin (TPO), platelet factor 4 (PF-4), platelet-derived growth
factor (PDGF), neural growth factors and basic fibroblast growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type,
expansion of these cells in culture will facilitate the production of large quantities of
mature cells. Techniques for culturing stem cells are known in the art and administration
of polypeptides of the invention, optionally with other growth factors and/or cytokines, is
expected to enhance the survival and proliferation of the stem cell populations. This can
be accomplished by direct administration of the polypeptide of the invention to the
culture medium. Alternatively, stromal cells transfected with a polynucleotide that
encodes the polypeptide of the invention can be used as a feeder layer for the stem
cell populations in culture or in vivo. Stromal support cells for feeder layers may include
embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to 
induce autocrine expression of the polypeptide of the invention. This will allow for 
generation of undifferentiated totipotential/pluripotential stem cell lines that are useful as 
is or that can then be differentiated into the desired mature cell types. These stable cell 
lines can also serve as a source of undifferentiated totipotential/pluripotential mRNA to 
create cDNA libraries and templates for polymerase chain reaction experiments. These 
studies would allow for the isolation and identification of differentially expressed genes 
in stem cell populations that regulate stem cell proliferation and/or maintenance. 

Expansion and maintenance of totipotent stem cell populations will be useful in 
the treatment of many pathological conditions. For example, polypeptides of the present 
invention may be used to manipulate stem cells in culture to give rise to neuroepithelial 
cells that can be used to augment or replace cells damaged by illness, autoimmune 
disease, accidental damage or genetic disorders. The polypeptide of the invention may be 
useful for inducing the proliferation of neural cells and for the regeneration of nerve and 
brain tissue, i.e., for the treatment of central and peripheral nervous system diseases and 
neuropathies, as well as mechanical and traumatic disorders which involve degeneration, 
death or trauma to neural cells or nerve tissue. In addition, the expanded stem cell 
populations can also be genetically altered for gene therapy purposes and to decrease host 
rejection of replacement tissues after grafting or implantation. 

Expression of the polypeptide of the invention and its effect on stem cells can also 
be manipulated to achieve controlled differentiation of the stem cells into more 
differentiated cell types. A broadly applicable method of obtaining pure populations of a 
specific differentiated cell type from undifferentiated stem cell populations involves the 
use of a cell-type specific promoter driving a selectable marker. The selectable marker 
allows only cells of the desired type to survive. For example, stem cells can be induced 
to differentiate into cardiomyocytes (Wobus et al., Differentiation, 48: 173-182, (1991); 
Klug et al., J. Clin. Invest., 98(1): 216-224, (1998)) or skeletal muscle cells (Browder, L. 
W. In: Principles of Tissue Engineering eds. Lanza et al., Academic Press (1997)). 
Alternatively, directed differentiation of stem cells can be accomplished by culturing the 
stem cells in the presence of a differentiation factor such as retinoic acid and an 
antagonist of the polypeptide of the invention which would inhibit the effects of 
endogenous stem cell factor activity and allow differentiation to proceed. 

In vitro cultures of stem cells can be used to determine if the polypeptide of the 
invention exhibits stem cell growth factor activity. Stem cells are isolated from any one 
of various cell sources (including hematopoietic stem cells and embryonic stem cells) and 
cultured on a feeder layer, as described by Thomson et al., Proc. Natl. Acad. Sci. U.S.A., 
92: 7844-7848 (1995), in the presence of the polypeptide of the invention alone or in 
combination with other growth factors or cytokines. The ability of the polypeptide of the 
invention to induce stem cell proliferation is determined by colony formation on 
semi-solid support, e.g., as described by Bernstein et al., Blood, 77: 2316-2321 (1991). 

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of 
hematopoiesis and, consequently, in the treatment of myeloid or lymphoid cell disorders. 
Even marginal biological activity in support of colony forming cells or of 
factor-dependent cell lines indicates involvement in regulating hematopoiesis, e.g., in 
supporting the growth and proliferation of erythroid progenitor cells alone or in 
combination with other cytokines, thereby indicating utility, for example, in treating 
various anemias or for use in conjunction with irradiation/chemotherapy to stimulate the 
production of erythroid precursors and/or erythroid cells; in supporting the growth and 
proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e., 
traditional CSF activity) useful, for example, in conjunction with chemotherapy to 
prevent or treat consequent myelo-suppression; in supporting the growth and proliferation 
of megakaryocytes and consequently of platelets thereby allowing prevention or 
treatment of various platelet disorders such as thrombocytopenia, and generally for use in 
place of or complementary to platelet transfusions; and/or in supporting the growth and 
proliferation of hematopoietic stem cells which are capable of maturing to any and all of 
the above-mentioned hematopoietic cells and therefore find therapeutic utility in various 
stem cell disorders (such as those usually treated with transplantation, including, without 
limitation, aplastic anemia and paroxysmal nocturnal hemoglobinuria), as well as in 
repopulating the stem cell compartment post irradiation/chemotherapy, either in vivo or 
ex vivo (i.e., in conjunction with bone marrow transplantation or with peripheral 
progenitor cell transplantation (homologous or heterologous)) as normal cells or 
genetically manipulated for gene therapy. 

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines 
are cited above. 

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without 
limitation, those described in: Johansson et al., Cellular Biology 15:141-151, 1995; Keller 
et al., Molecular and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 
81:2903-2915, 1993. 

Assays for stem cell survival and differentiation (which will identify, among 
others, proteins that regulate lympho-hematopoiesis) include, without limitation, those 
described in: Methylcellulose colony forming assays, Freshney, M. G. In Culture of 
Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New 
York, N.Y. 1994; Hirayama et al., Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; 
Primitive hematopoietic colony forming cells with high proliferative potential, McNiece, 
I. K. and Briddell, R. A. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol 
pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et al., Experimental 
Hematology 22:353-359, 1994; Cobblestone area forming cell assay, Ploemacher, R. E. 
In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21, Wiley-Liss, 
Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of stromal 
cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I. 
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term 
culture initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. 
Freshney, et al. eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994. 

4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage, 
tendon, ligament and/or nerve tissue growth or regeneration, as well as in wound healing 
and tissue repair and replacement, and in healing of burns, incisions and ulcers. 

A polypeptide of the present invention which induces cartilage and/or bone 
growth in circumstances where bone is not normally formed has application in the 
healing of bone fractures and cartilage damage or defects in humans and other animals. 
Compositions of a polypeptide, antibody, binding partner, or other modulator of the 
invention may have prophylactic use in closed as well as open fracture reduction and also 
in the improved fixation of artificial joints. De novo bone formation induced by an 
osteogenic agent contributes to the repair of congenital, trauma induced, or oncologic 
resection induced craniofacial defects, and also is useful in cosmetic plastic surgery. 

A polypeptide of this invention may also be involved in attracting bone-forming 
cells, stimulating growth of bone-forming cells, or inducing differentiation of progenitors 
of bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative 
disorders, or periodontal disease, such as through stimulation of bone and/or cartilage 
repair or by blocking inflammation or processes of tissue destruction (collagenase 
activity, osteoclast activity, etc.) mediated by inflammatory processes may also be 
possible using the composition of the invention. 

Another category of tissue regeneration activity that may involve the polypeptide 
of the present invention is tendon/ligament formation. Induction of tendon/ligament-like 
tissue or other tissue formation in circumstances where such tissue is not normally 
formed has application in the healing of tendon or ligament tears, deformities and other 
tendon or ligament defects in humans and other animals. Such a preparation employing a 
tendon/ligament-like tissue inducing protein may have prophylactic use in preventing 
damage to tendon or ligament tissue, as well as use in the improved fixation of tendon or 
ligament to bone or other tissues, and in repairing defects to tendon or ligament tissue. 
De novo tendon/ligament-like tissue formation induced by a composition of the present 
invention contributes to the repair of congenital, trauma induced, or other tendon or 
ligament defects of other origin, and is also useful in cosmetic plastic surgery for 
attachment or repair of tendons or ligaments. The compositions of the present invention 
may provide an environment to attract tendon- or ligament-forming cells, stimulate growth 
of tendon- or ligament-forming cells, induce differentiation of progenitors of tendon- or 
ligament-forming cells, or induce growth of tendon/ligament cells or progenitors ex vivo 
for return in vivo to effect tissue repair. The compositions of the invention may also be 
useful in the treatment of tendinitis, carpal tunnel syndrome and other tendon or ligament 
defects. The compositions may also include an appropriate matrix and/or sequestering 
agent as a carrier as is well known in the art. 

The compositions of the present invention may also be useful for proliferation of 
neural cells and for regeneration of nerve and brain tissue, i.e., for the treatment of central 
and peripheral nervous system diseases and neuropathies, as well as mechanical and 
traumatic disorders, which involve degeneration, death or trauma to neural cells or nerve 
tissue. More specifically, a composition may be used in the treatment of diseases of the 
peripheral nervous system, such as peripheral nerve injuries, peripheral neuropathy and 
localized neuropathies, and central nervous system diseases, such as Alzheimer's, 
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and Shy-Drager 
syndrome. Further conditions which may be treated in accordance with the present 
invention include mechanical and traumatic disorders, such as spinal cord disorders, head 
trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies resulting 
from chemotherapy or other medical therapies may also be treatable using a composition 
of the invention. 

Compositions of the invention may also be useful to promote better or faster 
closure of non-healing wounds, including without limitation pressure ulcers, ulcers 
associated with vascular insufficiency, surgical and traumatic wounds, and the like. 

Compositions of the present invention may also be involved in the generation or 
regeneration of other tissues, such as organs (including, for example, pancreas, liver, 
intestine, kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular 
(including vascular endothelium) tissue, or for promoting the growth of cells comprising 
such tissues. Part of the desired effects may be by inhibition or modulation of fibrotic 
scarring to allow normal tissue to regenerate. A polypeptide of the present invention 
may also exhibit angiogenic activity. 

A composition of the present invention may also be useful for gut protection or 
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, 
and conditions resulting from systemic cytokine damage. 

A composition of the present invention may also be useful for promoting or 
inhibiting differentiation of tissues described above from precursor tissues or cells; or for 
inhibiting the growth of tissues described above. 

Therapeutic compositions of the invention can be used in the following: 

Assays for tissue generation activity include, without limitation, those described 
in: International Patent Publication No. WO95/16035 (bone, cartilage, tendon); 
International Patent Publication No. WO95/05846 (nerve, neuronal); International Patent 
Publication No. WO91/07491 (skin, endothelium). 

Assays for wound healing activity include, without limitation, those described in: 
Winter, Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), 
Year Book Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. 
Invest. Dermatol. 71:382-84 (1978). 

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 

A polypeptide of the present invention may also exhibit immune stimulating or 
immune suppressing activity, including without limitation the activities for which assays 
are described herein. A polynucleotide of the invention can encode a polypeptide 
exhibiting such activities. A protein may be useful in the treatment of various immune 
deficiencies and disorders (including severe combined immunodeficiency (SCID)), e.g., 
in regulating (up or down) growth and proliferation of T and/or B lymphocytes, as well as 
affecting the cytolytic activity of NK cells and other cell populations. These immune 
deficiencies may be genetic or be caused by viral (e.g., HIV) as well as bacterial or 
fungal infections, or may result from autoimmune disorders. More specifically, infectious 
diseases caused by viral, bacterial, fungal or other infection may be treatable using a 
protein of the present invention, including infections by HIV, hepatitis viruses, herpes 
viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such 
as candidiasis. Of course, in this regard, proteins of the present invention may also be 
useful where a boost to the immune system generally may be desirable, i.e., in the 
treatment of cancer. 

Autoimmune disorders which may be treated using a protein of the present 
invention include, for example, connective tissue disease, multiple sclerosis, systemic 
lupus erythematosus, rheumatoid arthritis, autoimmune pulmonary inflammation, 
Guillain-Barre syndrome, autoimmune thyroiditis, insulin dependent diabetes mellitus, 
myasthenia gravis, graft-versus-host disease and autoimmune inflammatory eye disease. 
Such a protein (or antagonists thereof, including antibodies) of the present invention may 
also be useful in the treatment of allergic reactions and conditions (e.g., anaphylaxis, 
serum sickness, drug reactions, food allergies, insect venom allergies, mastocytosis, 
allergic rhinitis, hypersensitivity pneumonitis, urticaria, angioedema, eczema, atopic 
dermatitis, allergic contact dermatitis, erythema multiforme, Stevens-Johnson syndrome, 
allergic conjunctivitis, atopic keratoconjunctivitis, vernal keratoconjunctivitis, giant 
papillary conjunctivitis and contact allergies), such as asthma (particularly allergic 
asthma) or other respiratory problems. Other conditions, in which immune suppression is 
desired (including, for example, organ transplantation), may also be treatable using a 
protein (or antagonists thereof) of the present invention. The therapeutic effects of the 
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo 
animal models such as the cumulative contact enhancement test (Lastbom et al., 
Toxicology 125: 59-66, 1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 
1999), guinea pig skin sensitization test (Vohr et al., Arch. Toxicol. 73: 501-9), and 
murine local lymph node assay (Kimber et al., J. Toxicol. Environ. Health 53: 563-79). 

Using the proteins of the invention it may also be possible to modulate immune 
responses in a number of ways. Down regulation may be in the form of inhibiting or 
blocking an immune response already in progress or may involve preventing the 
induction of an immune response. The functions of activated T cells may be inhibited by 
suppressing T cell responses or by inducing specific tolerance in T cells, or both. 
Immunosuppression of T cell responses is generally an active, non-antigen-specific 
process which requires continuous exposure of the T cells to the suppressive agent. 
Tolerance, which involves inducing non-responsiveness or anergy in T cells, is 
distinguishable from immunosuppression in that it is generally antigen-specific and 
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be 
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the 
absence of the tolerizing agent. 

Down regulating or preventing one or more antigen functions (including without 
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing 
high level lymphokine synthesis by activated T cells, will be useful in situations of tissue, 
skin and organ transplantation and in graft-versus-host disease (GVHD). For example, 
blockage of T cell function should result in reduced tissue destruction in tissue 
transplantation. Typically, in tissue transplants, rejection of the transplant is initiated 
through its recognition as foreign by T cells, followed by an immune reaction that 
destroys the transplant. The administration of a therapeutic composition of the invention 
may prevent cytokine synthesis by immune cells, such as T cells, and thus acts as an 
immunosuppressant. Moreover, a lack of costimulation may also be sufficient to anergize 
the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance by B 
lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration 
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a 
subject, it may also be necessary to block the function of a combination of B lymphocyte 
antigens. 

The efficacy of particular therapeutic compositions in preventing organ transplant 
rejection or GVHD can be assessed using animal models that are predictive of efficacy in 
humans. Examples of appropriate systems which can be used include allogeneic cardiac 
grafts in rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been 
used to examine the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as 
described in Lenschow et al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. 
Acad. Sci. USA, 89:11102-11105 (1992). In addition, murine models of GVHD (see Paul 
ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 846-847) can be used 
to determine the effect of therapeutic compositions of the invention on the development 
of that disease. 

Blocking antigen function may also be therapeutically useful for treating 
autoimmune diseases. Many autoimmune disorders are the result of inappropriate 
activation of T cells that are reactive against self tissue and which promote the production 
of cytokines and autoantibodies involved in the pathology of the diseases. Preventing the 
activation of autoreactive T cells may reduce or eliminate disease symptoms. 
Administration of reagents which block stimulation of T cells can be used to inhibit T 
cell activation and prevent production of autoantibodies or T cell-derived cytokines 
which may be involved in the disease process. Additionally, blocking reagents may 
induce antigen-specific tolerance of autoreactive T cells which could lead to long-term 
relief from the disease. The efficacy of blocking reagents in preventing or alleviating 
autoimmune disorders can be determined using a number of well-characterized animal 
models of human autoimmune diseases. Examples include murine experimental 
autoimmune encephalitis, systemic lupus erythematosus in MRL/lpr/lpr mice or NZB 
hybrid mice, murine autoimmune collagen arthritis, diabetes mellitus in NOD mice and 
BB rats, and murine experimental myasthenia gravis (see Paul ed., Fundamental 
Immunology, Raven Press, New York, 1989, pp. 840-856). 

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a 
means of up regulating immune responses, may also be useful in therapy. Upregulation of 
immune responses may be in the form of enhancing an existing immune response or 
eliciting an initial immune response. For example, enhancing an immune response may 
be useful in cases of viral infection, including systemic viral diseases such as influenza, 
the common cold, and encephalitis. 

Alternatively, anti-viral immune responses may be enhanced in an infected patient 
by removing T cells from the patient, costimulating the T cells in vitro with viral 
antigen-pulsed APCs either expressing a peptide of the present invention or together with 
a stimulatory form of a soluble peptide of the present invention and reintroducing the in 
vitro activated T cells into the patient. Another method of enhancing anti-viral immune 
responses would be to isolate infected cells from a patient, transfect them with a nucleic 
acid encoding a protein of the present invention as described herein such that the cells 
express all or a portion of the protein on their surface, and reintroduce the transfected 
cells into the patient. The infected cells would now be capable of delivering a 
costimulatory signal to, and thereby activate, T cells in vivo. 

A polypeptide of the present invention may provide the necessary stimulation 
signal to T cells to induce a T cell mediated immune response against the transfected 
tumor cells. In addition, tumor cells which lack MHC class I or MHC class II molecules, 
or which fail to reexpress sufficient amounts of MHC class I or MHC class II molecules, 
can be transfected with nucleic acid encoding all or a portion (e.g., a 
cytoplasmic-domain truncated portion) of an MHC class I alpha chain protein and β2 
microglobulin protein or an MHC class II alpha chain protein and an MHC class II beta 
chain protein to thereby express MHC class I or MHC class II proteins on the cell 
surface. Expression of the appropriate class I or class II MHC in conjunction with a 
peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a 
T cell mediated immune response against the transfected tumor cell. Optionally, a gene 
encoding an antisense construct which blocks expression of an MHC class II associated 
protein, such as the invariant chain, can also be cotransfected with a DNA encoding a 
peptide having the activity of a B lymphocyte antigen to promote presentation of tumor 
associated antigens and induce tumor specific immunity. Thus, the induction of a T cell 
mediated immune response in a human subject may be sufficient to overcome 
tumor-specific tolerance in the subject. 

The activity of a protein of the invention may, among other means, be measured 
by the following methods: 

Suitable assays for thymocyte or splenocyte cytotoxicity include, without 
limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. 
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing 
Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte 
Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. 
Natl. Acad. Sci. USA 78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 
1982; Handa et al., J. Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988; Bowman et al., J. 
Virology 61:1992-1998; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; 
Brown et al., J. Immunol. 153:3079-3092, 1994. 

Assays for T-cell-dependent immunoglobulin responses and isotype switching 
(which will identify, among others, proteins that modulate T-cell dependent antibody 
responses and that affect Th1/Th2 profiles) include, without limitation, those described 
in: Maliszewski, J. Immunol. 144:3028-3033, 1990; and Assays for B cell function: In 
vitro antibody production, Mond, J. J. and Brunswick, M. In Current Protocols in 
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 3.8.1-3.8.16, John Wiley and Sons, 
Toronto. 1994. 

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, 
proteins that generate predominantly Th1 and CTL responses) include, without limitation, 
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. 
Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing 
Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte 
Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988; Bertagnolli et al., J. 
Immunol. 149:3778-3783, 1992. 

Dendritic cell-dependent assays (which will identify, among others, proteins 
expressed by dendritic cells that activate naive T-cells) include, without limitation, those 
described in: Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of 
Experimental Medicine 173:549-559, 1991; Macatonia et al., Journal of Immunology 
154:5071-5079, 1995; Porgador et al., Journal of Experimental Medicine 182:255-260, 
1995; Nair et al., Journal of Virology 67:4062-4069, 1993; Huang et al., Science 
264:961-965, 1994; Macatonia et al., Journal of Experimental Medicine 169:1255-1264, 
1989; Bhardwaj et al., Journal of Clinical Investigation 94:797-807, 1994; and Inaba et 
al., Journal of Experimental Medicine 172:631-640, 1990. 

Assays for lymphocyte survival/apoptosis (which will identify, among others, 
proteins that prevent apoptosis after superantigen induction and proteins that regulate 
lymphocyte homeostasis) include, without limitation, those described in: Darzynkiewicz 
et al., Cytometry 13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; 
Gorczyca et al., Cancer Research 53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; 
Zacharchuk, Journal of Immunology 145:4037-4045, 1990; Zamai et al., Cytometry 
14:891-897, 1993; Gorczyca et al., International Journal of Oncology 1:639-648, 1992. 

Assays for proteins that influence early steps of T-cell commitment and 
development include, without limitation, those described in: Antica et al., Blood 
84:111-117, 1994; Fine et al., Cellular Immunology 155:111-122, 1994; Galy et al., 
Blood 85:2770-2778, 1995; Toki et al., Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991. 

4.10.8 ACTIVIN/INHIBIN ACTIVITY 

A polypeptide of the present invention may also exhibit activin- or inhibin-related 
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such 
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle 
stimulating hormone (FSH), while activins are characterized by their ability to 
stimulate the release of follicle stimulating hormone (FSH). Thus, a polypeptide of the 
present invention, alone or in heterodimers with a member of the inhibin family, may be 
useful as a contraceptive based on the ability of inhibins to decrease fertility in female 
mammals and decrease spermatogenesis in male mammals. Administration of sufficient 
amounts of other inhibins can induce infertility in these mammals. Alternatively, the 
polypeptide of the invention, as a homodimer or as a heterodimer with other protein 
subunits of the inhibin group, may be useful as a fertility inducing therapeutic, based 
upon the ability of activin molecules to stimulate FSH release from cells of the anterior 
pituitary. See, for example, U.S. Pat. No. 4,798,885. A polypeptide of the invention may 
also be useful for advancement of the onset of fertility in sexually immature mammals, so 
as to increase the lifetime reproductive performance of domestic animals such as, but not 
limited to, cows, sheep and pigs. 

The activity of a polypeptide of the invention may, among other means, be 
measured by the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: 
Vale et al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale 
et al., Nature 321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., 
Proc. Natl. Acad. Sci. USA 83:3091-3095, 1986. 

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or 
chemokinetic activity for mammalian cells, including, for example, monocytes, 
fibroblasts, neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial 
cells. A polynucleotide of the invention can encode a polypeptide exhibiting such 
attributes. Chemotactic and chemokinetic receptor activation can be used to mobilize or 
attract a desired cell population to a desired site of action. Chemotactic or chemokinetic 
compositions (e.g., proteins, antibodies, binding partners, or modulators of the invention) 
provide particular advantages in treatment of wounds and other trauma to tissues, as well 
as in treatment of localized infections. For example, attraction of lymphocytes, 
monocytes or neutrophils to tumors or sites of infection may result in improved immune 
responses against the tumor or infecting agent. 

A protein or peptide has chemotactic activity for a particular cell population if it 
can stimulate, directly or indirectly, the directed orientation or movement of such cell 
population. Preferably, the protein or peptide has the ability to directly stimulate directed 
movement of cells. Whether a particular protein has chemotactic activity for a population 
of cells can be readily determined by employing such protein or peptide in any known 
assay for cell chemotaxis. 

Therapeutic compositions of the invention can be used in the following: 

Assays for chemotactic activity (which will identify proteins that induce or 
prevent chemotaxis) consist of assays that measure the ability of a protein to induce the 
migration of cells across a membrane as well as the ability of a protein to induce the 
adhesion of one cell population to another cell population. Suitable assays for movement 
and adhesion include, without limitation, those described in: Current Protocols in 
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. 
Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 6.12, 
Measurement of alpha and beta Chemokines 6.12.1-6.12.28); Taub et al. J. Clin. Invest. 
95:1370-1376, 1995; Lind et al. APMIS 103:140-146, 1995; Muller et al. Eur. J. 
Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867, 1994; Johnston et 
al. J. of Immunol. 153:1762-1768, 1994. 

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or 
thrombolysis or thrombosis. A polynucleotide of the invention can encode a polypeptide 
exhibiting such attributes. Compositions may be useful in treatment of various 
coagulation disorders (including hereditary disorders, such as hemophilias) or to enhance 
coagulation and other hemostatic events in treating wounds resulting from trauma, 
surgery or other causes. A composition of the invention may also be useful for dissolving 
or inhibiting formation of thromboses and for treatment and prevention of conditions 
resulting therefrom (such as, for example, infarction of cardiac and central nervous 
system vessels (e.g., stroke)). 

Therapeutic compositions of the invention can be used in the following: 
Assays for hemostatic and thrombolytic activity include, without limitation, those 
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., 
Thrombosis Res. 45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); 
Schaub, Prostaglandins 35:467-474, 1988. 

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, 
proliferation or metastasis. Detection of the presence or amount of polynucleotides or 
polypeptides of the invention may be useful for the diagnosis and/or prognosis of one or 
more types of cancer. For example, the presence or increased expression of a 
polynucleotide/polypeptide of the invention may indicate a hereditary risk of cancer, a 
precancerous condition, or an ongoing malignancy. Conversely, a defect in the gene or 
absence of the polypeptide may be associated with a cancer condition. Identification of 
single nucleotide polymorphisms associated with cancer or a predisposition to cancer 
may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell 
proliferation, inhibiting angiogenesis (growth of new blood vessels that is necessary to 
support tumor growth) and/or prohibiting metastasis by reducing tumor cell motility or 
invasiveness. Therapeutic compositions of the invention may be effective in adult and 
pediatric oncology including in solid phase tumors/malignancies, locally advanced 
tumors, human soft tissue sarcomas, metastatic cancer, including lymphatic metastases, 
blood cell malignancies including multiple myeloma, acute and chronic leukemias, and 
lymphomas, head and neck cancers including mouth cancer, larynx cancer and thyroid 
cancer, lung cancers including small cell carcinoma and non-small cell cancers, breast 
cancers including small cell carcinoma and ductal carcinoma, gastrointestinal cancers 
including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps 
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers 
including bladder cancer and prostate cancer, malignancies of the female genital tract 
including ovarian carcinoma, uterine (including endometrial) cancers, and solid tumor in 
the ovarian follicle, kidney cancers including renal cell carcinoma, brain cancers 
including intrinsic brain tumors, neuroblastoma, astrocytic brain tumors, gliomas, 
metastatic tumor cell invasion in the central nervous system, bone cancers including 
osteomas, skin cancers including malignant melanoma, tumor progression of human skin 
keratinocytes, squamous cell carcinoma, basal cell carcinoma, hemangiopericytoma and 
Kaposi's sarcoma. 

Polypeptides, polynucleotides, or modulators of polypeptides of the invention 
(including inhibitors and stimulators of the biological activity of the polypeptide of the 
invention) may be administered to treat cancer. Therapeutic compositions can be 
administered in therapeutically effective dosages alone or in combination with adjuvant 
cancer therapy such as surgery, chemotherapy, radiotherapy, thermotherapy, and laser 
therapy, and may provide a beneficial effect, e.g. reducing tumor size, slowing rate of 
tumor growth, inhibiting metastasis, or otherwise improving overall clinical condition, 
without necessarily eradicating the cancer. 

The composition can also be administered in therapeutically effective amounts as 
a portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the 
polypeptide or modulator of the invention with one or more anti-cancer drugs in addition 
to a pharmaceutically acceptable carrier for delivery. The use of anti-cancer cocktails as 
a cancer treatment is routine. Anti-cancer drugs that are well known in the art and can be 
used as a treatment in combination with the polypeptide or modulator of the invention 
include: Actinomycin D, Aminoglutethimide, Asparaginase, Bleomycin, Busulfan, 
Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP), Cyclophosphamide, 
Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin, Daunorubicin HCl, 
Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (V16-213), Floxuridine, 
5-Fluorouracil (5-FU), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, Interferon 
Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog), 
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, 
Mesna, Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, 
Procarbazine HCl, Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine 
sulfate, Vincristine sulfate, Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, 
Mitoguazone, Pentostatin, Semustine, Teniposide, and Vindesine sulfate. 

In addition, therapeutic compositions of the invention may be used for 
prophylactic treatment of cancer. There are hereditary conditions and/or environmental 
situations (e.g. exposure to carcinogens) known in the art that predispose an individual to 
developing cancers. Under these circumstances, it may be beneficial to treat these 
individuals with therapeutically effective doses of the polypeptide of the invention to 
reduce the risk of developing cancers. 

In vitro models can be used to determine the effective doses of the polypeptide of 
the invention as a potential cancer treatment. These in vitro models include proliferation 
assays of cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, 
(1987) Culture of Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, 
NY Ch 18 and Ch 21), tumor systems in nude mice as described in Giovanella et al., J. 
Natl. Can. Inst., 52:921-30 (1974), mobility and invasive potential of tumor cells in 
Boyden Chamber assays as described in Pilkington et al., Anticancer Res., 17:4107-9 
(1997), and angiogenesis assays such as induction of vascularization of the chick 
chorioallantoic membrane or induction of vascular endothelial cell migration as described 
in Ribatta et al., Intl. J. Dev. Biol., 40:1189-97 (1999) and Li et al., Clin. Exp. 
Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available, e.g. 
from American Type Tissue Culture Collection catalogs. 

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as receptor, 
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide 
of the invention can encode a polypeptide exhibiting such characteristics. Examples of 
such receptors and ligands include, without limitation, cytokine receptors and their 
ligands, receptor kinases and their ligands, receptor phosphatases and their ligands, 
receptors involved in cell-cell interactions and their ligands (including without limitation, 
cellular adhesion molecules (such as selectins, integrins and their ligands) and 
receptor/ligand pairs involved in antigen presentation, antigen recognition and 
development of cellular and humoral immune responses). Receptors and ligands are also 
useful for screening of potential peptide or small molecule inhibitors of the relevant 
receptor/ligand interaction. A protein of the present invention (including, without 
limitation, fragments of receptors and ligands) may itself be useful as an inhibitor of 
receptor/ligand interactions. 

The activity of a polypeptide of the invention may, among other means, be 
measured by the following methods: 

Suitable assays for receptor-ligand activity include without limitation those 
described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, 
D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and 
Wiley-Interscience (Chapter 7.28, Measurement of Cellular Adhesion under static 
conditions 7.28.1-7.28.22), Takai et al., Proc. Natl. Acad. Sci. USA 84:6864-6868, 1987; 
Bierer et al., J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp. Med. 
169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al., 
Cell 80:661-670, 1995. 

By way of example, the polypeptides of the invention may be used as a receptor 
for a ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may 
be identified through binding assays, affinity chromatography, dihybrid screening assays, 
BIAcore assays, gel overlay assays, or other methods known in the art. 

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists 
or partial antagonists require the use of other proteins as competing ligands. The 
polypeptides of the present invention or ligand(s) thereof may be labeled by being 
coupled to radioisotopes, colorimetric molecules or toxin molecules by conventional 
methods ("Guide to Protein Purification," Murray P. Deutscher (ed.), Methods in 
Enzymology Vol. 182 (1990), Academic Press, Inc., San Diego). Examples of 
radioisotopes include, but are not limited to, tritium and carbon-14. Examples of 
colorimetric molecules include, but are not limited to, fluorescent molecules such as 
fluorescamine, or rhodamine or other colorimetric molecules. Examples of toxins 
include, but are not limited to, ricin. 

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using 
the novel polypeptides or binding fragments thereof in any of a variety of drug screening 
techniques. The polypeptides or fragments employed in such a test may either be free in 
solution, affixed to a solid support, borne on a cell surface or located intracellularly. One 
method of drug screening utilizes eukaryotic or prokaryotic host cells which are stably 
transformed with recombinant nucleic acids expressing the polypeptide or a fragment 
thereof. Drugs are screened against such transformed cells in competitive binding assays. 
Such cells, either in viable or fixed form, can be used for standard binding assays. One 
may measure, for example, the formation of complexes between polypeptides of the 
invention or fragments and the agent being tested or examine the diminution in complex 
formation between the novel polypeptides and an appropriate cell line, which are well 
known in the art. 
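
By way of illustration only, the diminution in complex formation observed in a competitive binding assay could be expressed as a percent inhibition relative to a reading taken without the test agent. The following is a minimal sketch under stated assumptions: the function names, the background-subtraction step and the signal units are hypothetical and are not taken from this specification.

```python
def percent_inhibition(signal_with_agent, signal_total, signal_background=0.0):
    """Percent decrease in specific complex formation caused by a test agent.

    signal_total: binding signal with no test agent present (hypothetical units).
    signal_with_agent: binding signal in the presence of the test agent.
    signal_background: nonspecific signal, subtracted from both readings.
    """
    specific_total = signal_total - signal_background
    if specific_total <= 0:
        raise ValueError("total signal must exceed background")
    specific_with_agent = signal_with_agent - signal_background
    return 100.0 * (1.0 - specific_with_agent / specific_total)


def rank_agents(readings, signal_total, signal_background=0.0):
    """Sort test agents from strongest to weakest inhibitor of complex formation.

    readings maps a (hypothetical) agent name to its binding signal.
    """
    scored = {
        name: percent_inhibition(value, signal_total, signal_background)
        for name, value in readings.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)
```

An agent that leaves the signal unchanged scores 0 percent, and an agent that reduces the signal to background scores 100 percent; any threshold for calling an agent a "hit" would be chosen by the practitioner.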

Sources for test compounds that may be screened for ability to bind to or 
modulate (i.e., increase or decrease) the activity of polypeptides of the invention include 
(1) inorganic and organic chemical libraries, (2) natural product libraries, and (3) 
combinatorial libraries comprised of either random or mimetic peptides, oligonucleotides 
or organic molecules. 

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or 
compounds that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria 
and fungi), animals, plants or other vegetation, or marine organisms, and libraries of 
mixtures for screening may be created by: (1) fermentation and extraction of broths from 
soil, plant or marine microorganisms or (2) extraction of the organisms themselves. 
Natural product libraries include polyketides, non-ribosomal peptides, and (non-naturally 
occurring) variants thereof. For a review, see Science 282:63-68 (1998). 

Combinatorial libraries are composed of large numbers of peptides, 
oligonucleotides or organic compounds and can be readily prepared by traditional 
automated synthesis methods, PCR, cloning or proprietary synthetic methods. Of 
particular interest are peptide and oligonucleotide combinatorial libraries. Still other 
libraries of interest include peptide, protein, peptidomimetic, multiparallel synthetic 
collection, recombinatorial, and polypeptide libraries. For a review of combinatorial 
chemistry and libraries created therefrom, see Myers, Curr. Opin. Biotechnol. 8:701-707 
(1997). For reviews and examples of peptidomimetic libraries, see Al-Obeidi et al., Mol 
Biotechnol 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol 1(1):114-19 (1997); 
Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides). 

Identification of modulators through use of the various libraries described herein 
permits modification of the candidate "hit" (or "lead") to optimize the capacity of the 
"hit" to bind a polypeptide of the invention. The molecules identified in the binding assay 
are then tested for antagonist or agonist activity in in vivo tissue culture or animal models 
that are well known in the art. In brief, the molecules are titrated into a plurality of cell 
cultures or animals and then tested for either cell/animal death or prolonged survival of 
the animal/cells. 

The binding molecules thus identified may be complexed with toxins, e.g., ricin 
or cholera, or with other compounds that are toxic to cells such as radioisotopes. The 
toxin-binding molecule complex is then targeted to a tumor or other cell by the specificity 
of the binding molecule for a polypeptide of the invention. Alternatively, the binding 
molecules may be complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, 
e.g. a ligand or a receptor. The art provides numerous assays particularly useful for 
identifying previously unknown binding partners for receptor polypeptides of the 
invention. For example, expression cloning using mammalian or bacterial cells, or 
dihybrid screening assays can be used to identify polynucleotides encoding binding 
partners. As another example, affinity chromatography with the appropriate immobilized 
polypeptide of the invention can be used to isolate polypeptides that recognize and bind 
polypeptides of the invention. There are a number of different libraries used for the 
identification of compounds, and in particular small molecules, that modulate (i.e., 
increase or decrease) biological activity of a polypeptide of the invention. Ligands for 
receptor polypeptides of the invention can also be identified by adding exogenous 
ligands, or cocktails of ligands, to two cell populations that are genetically identical 
except for the expression of the receptor of the invention: one cell population expresses 
the receptor of the invention whereas the other does not. The responses of the two cell 
populations to the addition of ligand(s) are then compared. Alternatively, an expression 
library can be co-expressed with the polypeptide of the invention in cells and assayed for 
an autocrine response to identify potential ligand(s). As still another example, BIAcore 
assays, gel overlay assays, or other methods known in the art can be used to identify 
binding partner polypeptides, including, (1) organic and inorganic chemical libraries, (2) 
natural product libraries, and (3) combinatorial libraries comprised of random peptides, 
oligonucleotides or organic molecules. 
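
The comparison of two cell populations that differ only in expression of the receptor can be sketched as follows. This is an illustrative outline only: the function name, the fold-change criterion and the default cutoff of 2.0 are assumptions introduced here and do not appear in this specification.

```python
def candidate_ligands(response_with_receptor, response_without_receptor, min_fold=2.0):
    """Return ligands whose response depends on expression of the receptor.

    Both arguments map a ligand name to a measured response (any consistent
    unit) in the receptor-expressing and the non-expressing population.
    min_fold is a hypothetical cutoff, not a value taken from the text.
    """
    hits = []
    for ligand, with_receptor in response_with_receptor.items():
        without_receptor = response_without_receptor[ligand]
        if without_receptor <= 0:
            # No baseline response: any positive response counts as a hit.
            if with_receptor > 0:
                hits.append(ligand)
            continue
        if with_receptor / without_receptor >= min_fold:
            hits.append(ligand)
    return hits
```

A ligand that produces the same response in both populations is not reported, because that response cannot be attributed to the receptor of the invention.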

The role of downstream intracellular signaling molecules in the signaling cascade 
of the polypeptide of the invention can be determined. For example, a chimeric protein in 
which the cytoplasmic domain of the polypeptide of the invention is fused to the 
extracellular portion of a protein, whose ligand has been identified, is produced in a host 
cell. The cell is then incubated with the ligand specific for the extracellular portion of the 
chimeric protein, thereby activating the chimeric receptor. Known downstream proteins 
involved in intracellular signaling can then be assayed for expected modifications, i.e. 
phosphorylation. Other methods known to those in the art can also be used to identify 
signaling molecules involved in receptor activity. 

4.10.15 ANTI-INFLAMMATORY ACTIVITY 

Compositions of the present invention may also exhibit anti-inflammatory 
activity. The anti-inflammatory activity may be achieved by providing a stimulus to cells 
involved in the inflammatory response, by inhibiting or promoting cell-cell interactions 
(such as, for example, cell adhesion), by inhibiting or promoting chemotaxis of cells 
involved in the inflammatory process, inhibiting or promoting cell extravasation, or by 
stimulating or suppressing production of other factors which more directly inhibit or 
promote an inflammatory response. Compositions with such activities can be used to treat 
inflammatory conditions (including chronic or acute conditions), including without 
limitation inflammation associated with infection (such as septic shock, sepsis or systemic 
inflammatory response syndrome (SIRS)), ischemia-reperfusion injury, endotoxin 
lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or 
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting 
from overproduction of cytokines such as TNF or IL-1. Compositions of the invention 
may also be useful to treat anaphylaxis and hypersensitivity to an antigenic substance or 
material. Compositions of this invention may be utilized to prevent or treat conditions 
such as, but not limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced 
shock, rheumatoid arthritis, chronic inflammatory arthritis, pancreatic cell damage from 
diabetes mellitus type 1, graft versus host disease, inflammatory bowel disease, 
inflammation associated with pulmonary disease, other autoimmune disease or 
inflammatory disease, as an antiproliferative agent such as for acute or chronic myelogenous 
leukemia, or in the prevention of premature labor secondary to intrauterine infections. 

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of 
a therapeutic that promotes or inhibits function of the polynucleotides and/or 
polypeptides of the invention. Such leukemias and related disorders include but are not 
limited to acute leukemia, acute lymphocytic leukemia, acute myelocytic leukemia, 
myeloblastic, promyelocytic, myelomonocytic, monocytic, erythroleukemia, chronic 
leukemia, chronic myelocytic (granulocytic) leukemia and chronic lymphocytic leukemia 
(for a review of such disorders, see Fishman et al., 1985, Medicine, 2d Ed., J.B. 
Lippincott Co., Philadelphia). 

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication 
of therapeutic utility, include but are not limited to nervous system injuries, and diseases 
or disorders which result in either a disconnection of axons, a diminution or degeneration 
of neurons, or demyelination. Nervous system lesions which may be treated in a patient 
(including human and non-human mammalian patients) according to the invention 
include but are not limited to the following lesions of either the central (including spinal 
cord, brain) or peripheral nervous systems: 

(i) traumatic lesions, including lesions caused by physical injury or associated 
with surgery, for example, lesions which sever a portion of the nervous system, or 
compression injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous 
system results in neuronal injury or death, including cerebral infarction or ischemia, or 
spinal cord infarction or ischemia; 

(iii) infectious lesions, in which a portion of the nervous system is destroyed or 
injured as a result of infection, for example, by an abscess or associated with infection by 
human immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme 
disease, tuberculosis, syphilis; 

(iv) degenerative lesions, in which a portion of the nervous system is destroyed 
or injured as a result of a degenerative process including but not limited to degeneration 
associated with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or 
amyotrophic lateral sclerosis; 

(v) lesions associated with nutritional diseases or disorders, in which a portion 
of the nervous system is destroyed or injured by a nutritional disorder or disorder of 
metabolism including but not limited to, vitamin B12 deficiency, folic acid deficiency, 
Wernicke disease, tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary 
degeneration of the corpus callosum), and alcoholic cerebellar degeneration; 

(vi) neurological lesions associated with systemic diseases including but not 
limited to diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, 
carcinoma, or sarcoidosis; 

(vii) lesions caused by toxic substances including alcohol, lead, or particular 
neurotoxins; and 

(viii) demyelinated lesions in which a portion of the nervous system is 
destroyed or injured by a demyelinating disease including but not limited to multiple 
sclerosis, human immunodeficiency virus-associated myelopathy, transverse myelopathy 
of various etiologies, progressive multifocal leukoencephalopathy, and central pontine 
myelinolysis. 

Therapeutics which are useful according to the invention for treatment of a 
nervous system disorder may be selected by testing for biological activity in promoting 
the survival or differentiation of neurons. For example, and not by way of limitation, 
therapeutics which elicit any of the following effects may be useful according to the 
invention: 

(i) increased survival time of neurons in culture; 
(ii) increased sprouting of neurons in culture or in vivo; 
(iii) increased production of a neuron-associated molecule in culture or in vivo, 
e.g., choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or 

(iv) decreased symptoms of neuron dysfunction in vivo. 

Such effects may be measured by any method known in the art. In preferred, 
non-limiting embodiments, increased survival of neurons may be measured by the 
method set forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting 
of neurons may be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 
70:65-82) or Brown et al. (1981, Ann. Rev. Neurosci. 4:17-42); increased production of 
neuron-associated molecules may be measured by bioassay, enzymatic assay, antibody 
binding, Northern blot assay, etc., depending on the molecule to be measured; and motor 
neuron dysfunction may be measured by assessing the physical manifestation of motor 
neuron disorder, e.g., weakness, motor neuron conduction velocity, or functional 
disability. 

In specific embodiments, motor neuron disorders that may be treated according to 
the invention include but are not limited to disorders such as infarction, infection, 
exposure to toxin, trauma, surgical damage, degenerative disease or malignancy that may 
affect motor neurons as well as other components of the nervous system, as well as 
disorders that selectively affect neurons such as amyotrophic lateral sclerosis, and 
including but not limited to progressive spinal muscular atrophy, progressive bulbar 
palsy, primary lateral sclerosis, infantile and juvenile muscular atrophy, progressive 
bulbar paralysis of childhood (Fazio-Londe syndrome), poliomyelitis and the post polio 
syndrome, and Hereditary Motorsensory Neuropathy (Charcot-Marie-Tooth Disease). 

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following 
additional activities or effects: inhibiting the growth, infection or function of, or killing, 
infectious agents, including, without limitation, bacteria, viruses, fungi and other 
parasites; affecting (suppressing or enhancing) bodily characteristics, including, without 
limitation, height, weight, hair color, eye color, skin, fat to lean ratio or other tissue 
pigmentation, or organ or body part size or shape (such as, for example, breast 
augmentation or diminution, change in bone form or shape); affecting biorhythms or 
circadian cycles or rhythms; affecting the fertility of male or female subjects; affecting 
the metabolism, catabolism, anabolism, processing, utilization, storage or elimination of 
dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other nutritional 
factors or components; affecting behavioral characteristics, including, without 
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression 
(including depressive disorders) and violent behaviors; providing analgesic effects or 
other pain reducing effects; promoting differentiation and growth of embryonic stem cells 
in lineages other than hematopoietic lineages; hormonal or endocrine activity; in the case 
of enzymes, correcting deficiencies of the enzyme and treating deficiency-related 
diseases; treatment of hyperproliferative disorders (such as, for example, psoriasis); 
immunoglobulin-like activity (such as, for example, the ability to bind antigens or 
complement); and the ability to act as an antigen in a vaccine composition to raise an 
immune response against such protein or another material or entity which is 
cross-reactive with such protein. 

4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such 
polymorphisms in human subjects and the pharmacogenetic use of this information for 
diagnosis and treatment. Such polymorphisms may be associated with, e.g., differential 
predisposition or susceptibility to various disease states (such as disorders involving 
inflammation or immune response) or a differential response to drug administration, and 
this genetic information can be used to tailor preventive or therapeutic treatment 
appropriately. For example, the existence of a polymorphism associated with a 
predisposition to inflammation or autoimmune disease makes possible the diagnosis of 
this condition in humans by identifying the presence of the polymorphism. 

Polymorphisms can be identified in a variety of ways known in the art which all 
generally involve obtaining a sample from a patient, analyzing DNA from the sample, 
optionally involving isolation or amplification of the DNA, and identifying the presence 
of the polymorphism in the DNA. For example, PCR may be used to amplify an 
appropriate fragment of genomic DNA which may then be sequenced. Alternatively, the 
DNA may be subjected to allele-specific oligonucleotide hybridization (in which 
appropriate oligonucleotides are hybridized to the DNA under conditions permitting 
detection of a single base mismatch) or to a single nucleotide extension assay (in which 
an oligonucleotide that hybridizes immediately adjacent to the position of the 
polymorphism is extended with one or more labeled nucleotides). In addition, traditional 
restriction fragment length polymorphism analysis (using restriction enzymes that 
provide differential digestion of the genomic DNA depending on the presence or absence 
of the polymorphism) may be performed. Arrays with nucleotide sequences of the 
present invention can be used to detect polymorphisms. The array can comprise modified 
nucleotide sequences of the present invention in order to detect the nucleotide sequences 
of the present invention. In the alternative, any one of the nucleotide sequences of the 
present invention can be placed on the array to detect changes from those sequences. 
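
As a non-limiting illustration of restriction fragment length polymorphism analysis, the following sketch checks in silico whether a given restriction enzyme would digest two alleles differently. The function names and the representation of an enzyme as a recognition site plus a cut offset are assumptions made for this example; only the top strand of a linear fragment is considered, and partial digestion is ignored.

```python
def restriction_sites(sequence, site):
    """Return 0-based start positions of every occurrence of a recognition site."""
    sequence, site = sequence.upper(), site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def fragment_lengths(sequence, site, cut_offset):
    """Fragment lengths after complete digestion of a linear molecule.

    cut_offset is the number of bases from the start of the site to the cut
    on the strand given (e.g. 1 for EcoRI, which cuts G^AATTC).
    """
    cuts = [p + cut_offset for p in restriction_sites(sequence, site)]
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


def distinguishes_alleles(allele_a, allele_b, site, cut_offset):
    """True if the two alleles give different fragment length patterns."""
    return sorted(fragment_lengths(allele_a, site, cut_offset)) != sorted(
        fragment_lengths(allele_b, site, cut_offset)
    )
```

For instance, a single base change that destroys a GAATTC site would leave the variant allele uncut by EcoRI, so the two alleles would yield different fragment patterns.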

Alternatively a polymorphism resulting in a change in the amino acid sequence 
could also be detected by detecting a corresponding change in amino acid sequence of the 
protein, e.g., by an antibody specific to the variant sequence. 

4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against 
rheumatoid arthritis are determined in an experimental animal model system. The 
experimental model system is adjuvant induced arthritis in rats, and the protocol is 
described by J. Holoshitz, et al., 1983, Science, 219:56, or by B. Waksman et al., 1963, 
Int. Arch. Allergy Appl. Immunol., 23:129. Induction of the disease can be caused by a 
single injection, generally intradermally, of a suspension of killed Mycobacterium 
tuberculosis in complete Freund's adjuvant (CFA). The route of injection can vary, but 
rats may be injected at the base of the tail with an adjuvant mixture. The polypeptide is 
administered in phosphate buffered solution (PBS) at a dose of about 1-5 mg/kg. The 

The procedure for testing the effects of the test compound would consist of 
intradermally injecting killed Mycobacterium tuberculosis in CFA followed by 
immediately administering the test compound and subsequent treatment every other day 
until day 24. At 14, 15, 18, 20, 22, and 24 days after injection of Mycobacterium CFA, an 
overall arthritis score may be obtained as described by J. Holoshitz above. An analysis of 
the data would reveal that the test compound would have a dramatic effect on the 
swelling of the joints as measured by a decrease of the arthritis score. 
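
The treatment and scoring schedule described above can be laid out as in the following sketch. It assumes that the day of CFA injection is day 0 and that "every other day" means days 0, 2, 4 and so on through day 24; the function names and the data layout are hypothetical, and the scoring scale itself is not reproduced here.

```python
SCORING_DAYS = (14, 15, 18, 20, 22, 24)  # days after CFA injection, from the text


def treatment_days(last_day=24, interval=2):
    """Days on which the test compound (or PBS control) is given, starting at day 0."""
    return list(range(0, last_day + 1, interval))


def dose_mg(body_weight_kg, dose_mg_per_kg):
    """Absolute dose for one animal; the text gives a range of about 1-5 mg/kg."""
    if not 1.0 <= dose_mg_per_kg <= 5.0:
        raise ValueError("dose outside the approximately 1-5 mg/kg range in the text")
    return body_weight_kg * dose_mg_per_kg


def mean_score_by_day(scores_by_animal):
    """Average the arthritis score across animals for each scoring day.

    scores_by_animal maps an animal identifier to a dict of day -> score.
    """
    count = len(scores_by_animal)
    return {
        day: sum(scores[day] for scores in scores_by_animal.values()) / count
        for day in SCORING_DAYS
    }
```

Computing the mean score per day separately for the treated group and the PBS-only control group allows the two time courses to be compared directly.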

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and 
antibodies or other binding partners or modulators including antisense polynucleotides) 
of the invention have numerous applications in a variety of therapeutic methods. 
Examples of therapeutic applications include, but are not limited to, those exemplified 
herein. 

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of 
the polypeptides or other composition of the invention to individuals affected by a 
disease or disorder that can be modulated by regulating the peptides of the invention. 
While the mode of administration is not particularly important, parenteral administration 
is preferred. An exemplary mode of administration is to deliver an intravenous bolus. 
The dosage of the polypeptides or other composition of the invention will normally be 
determined by the prescribing physician. It is to be expected that the dosage will vary 
according to the age, weight, condition and response of the individual patient. Typically, 
the amount of polypeptide administered per dose will be in the range of about 0.01 μg/kg 
to 100 mg/kg of body weight, with the preferred dose being about 0.1 μg/kg to 10 mg/kg 
of patient body weight. For parenteral administration, polypeptides of the invention will 
be formulated in an injectable form combined with a pharmaceutically acceptable 
parenteral vehicle. Such vehicles are well known in the art and examples include water, 
saline, Ringer's solution, dextrose solution, and solutions consisting of small amounts of 
human serum albumin. The vehicle may contain minor amounts of additives that 
maintain the isotonicity and stability of the polypeptide or other active ingredient. The 
preparation of such solutions is within the skill of the art. 

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF
ADMINISTRATION

A protein or other composition of the present invention (from whatever source
derived, including without limitation from recombinant and non-recombinant sources and
including antibodies and other binding partners of the polypeptides of the invention) may
be administered to a patient in need, by itself, or in pharmaceutical compositions where it
is mixed with suitable carriers or excipient(s) at doses to treat or ameliorate a variety of
disorders. Such a composition may optionally contain (in addition to protein or other
active ingredient and a carrier) diluents, fillers, salts, buffers, stabilizers, solubilizers, and
other materials well known in the art. The term "pharmaceutically acceptable" means a
non-toxic material that does not interfere with the effectiveness of the biological activity
of the active ingredient(s). The characteristics of the carrier will depend on the route of
administration. The pharmaceutical composition of the invention may also contain
cytokines, lymphokines, or other hematopoietic factors such as M-CSF, GM-CSF, TNF,
IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14,
IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell factor,
and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These
agents include various growth factors such as epidermal growth factor (EGF),
platelet-derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β),
insulin-like growth factor (IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either
enhance the activity of the protein or other active ingredient or complement its activity or
use in treatment. Such additional factors and/or agents may be included in the
pharmaceutical composition to produce a synergistic effect with protein or other active
ingredient of the invention, or to minimize side effects. Conversely, protein or other
active ingredient of the present invention may be included in formulations of the
particular clotting factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic
or anti-thrombotic factor, or anti-inflammatory agent to minimize side effects of the
clotting factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic or
anti-thrombotic factor, or anti-inflammatory agent (such as IL-1Ra, IL-1 Hy1, IL-1 Hy2,
anti-TNF, corticosteroids, immunosuppressive agents). A protein of the present
invention may be active in multimers (e.g., heterodimers or homodimers) or complexes
with itself or other proteins. As a result, pharmaceutical compositions of the invention
may comprise a protein of the invention in such multimeric or complexed form.

As an alternative to being included in a pharmaceutical composition of the
invention including a first protein, a second protein or a therapeutic agent may be
concurrently administered with the first protein (e.g., at the same time, or at differing
times provided that therapeutic concentrations of the combination of agents are achieved at
the treatment site). Techniques for formulation and administration of the compounds of
the instant application may be found in "Remington's Pharmaceutical Sciences," Mack
Publishing Co., Easton, PA, latest edition. A therapeutically effective dose further refers
to that amount of the compound sufficient to result in amelioration of symptoms, e.g.,
treatment, healing, prevention or amelioration of the relevant medical condition, or an
increase in rate of treatment, healing, prevention or amelioration of such conditions.
When applied to an individual active ingredient, administered alone, a therapeutically
effective dose refers to that ingredient alone. When applied to a combination, a
therapeutically effective dose refers to combined amounts of the active ingredients that
result in the therapeutic effect, whether administered in combination, serially or
simultaneously.

In practicing the method of treatment or use of the present invention, a
therapeutically effective amount of protein or other active ingredient of the present
invention is administered to a mammal having a condition to be treated. Protein or other
active ingredient of the present invention may be administered in accordance with the
method of the invention either alone or in combination with other therapies such as
treatments employing cytokines, lymphokines or other hematopoietic factors. When
co-administered with one or more cytokines, lymphokines or other hematopoietic factors,
protein or other active ingredient of the present invention may be administered either
simultaneously with the cytokine(s), lymphokine(s), other hematopoietic factor(s),
thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially, the
attending physician will decide on the appropriate sequence of administering protein or
other active ingredient of the present invention in combination with cytokine(s),
lymphokine(s), other hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal,
transmucosal, or intestinal administration; parenteral delivery, including intramuscular,
subcutaneous, intramedullary injections, as well as intrathecal, direct intraventricular,
intravenous, intraperitoneal, intranasal, or intraocular injections. Administration of
protein or other active ingredient of the present invention used in the pharmaceutical
composition or to practice the method of the present invention can be carried out in a
variety of conventional ways, such as oral ingestion, inhalation, topical application or
cutaneous, subcutaneous, intraperitoneal, parenteral or intravenous injection. Intravenous
administration to the patient is preferred.

Alternatively, one may administer the compound in a local rather than systemic
manner, for example, via injection of the compound directly into arthritic joints or
fibrotic tissue, often in a depot or sustained release formulation. In order to prevent the
scarring process frequently occurring as a complication of glaucoma surgery, the
compounds may be administered topically, for example, as eye drops. Furthermore, one
may administer the drug in a targeted drug delivery system, for example, in a liposome
coated with a specific antibody, targeting, for example, arthritic or fibrotic tissue. The
liposomes will be targeted to and taken up selectively by the afflicted tissue.

The polypeptides of the invention are administered by any route that delivers an
effective dosage to the desired site of action. The determination of a suitable route of
administration and an effective dosage for a particular indication is within the level of
skill in the art. Preferably for wound treatment, one administers the therapeutic
compound directly to the site. Suitable dosage ranges for the polypeptides of the
invention can be extrapolated from these dosages or from similar studies in appropriate
animal models. Dosages can then be adjusted as necessary by the clinician to provide
maximal therapeutic benefit.

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention
thus may be formulated in a conventional manner using one or more physiologically
acceptable carriers comprising excipients and auxiliaries which facilitate processing of
the active compounds into preparations which can be used pharmaceutically. These
pharmaceutical compositions may be manufactured in a manner that is itself known, e.g.,
by means of conventional mixing, dissolving, granulating, dragee-making, levigating,
emulsifying, encapsulating, entrapping or lyophilizing processes. Proper formulation is
dependent upon the route of administration chosen. When a therapeutically effective
amount of protein or other active ingredient of the present invention is administered
orally, protein or other active ingredient of the present invention will be in the form of a
tablet, capsule, powder, solution or elixir. When administered in tablet form, the
pharmaceutical composition of the invention may additionally contain a solid carrier such
as a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95%
protein or other active ingredient of the present invention, and preferably from about 25
to 90% protein or other active ingredient of the present invention. When administered in
liquid form, a liquid carrier such as water, petroleum, oils of animal or plant origin such
as peanut oil, mineral oil, soybean oil, or sesame oil, or synthetic oils may be added. The
liquid form of the pharmaceutical composition may further contain physiological saline
solution, dextrose or other saccharide solution, or glycols such as ethylene glycol,
propylene glycol or polyethylene glycol. When administered in liquid form, the
pharmaceutical composition contains from about 0.5 to 90% by weight of protein or other
active ingredient of the present invention, and preferably from about 1 to 50% protein or
other active ingredient of the present invention.

When a therapeutically effective amount of protein or other active ingredient of
the present invention is administered by intravenous, cutaneous or subcutaneous
injection, protein or other active ingredient of the present invention will be in the form of
a pyrogen-free, parenterally acceptable aqueous solution. The preparation of such
parenterally acceptable protein or other active ingredient solutions, having due regard to
pH, isotonicity, stability, and the like, is within the skill in the art. A preferred
pharmaceutical composition for intravenous, cutaneous, or subcutaneous injection should
contain, in addition to protein or other active ingredient of the present invention, an
isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, Dextrose
Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or other
vehicle as known in the art. The pharmaceutical composition of the present invention
may also contain stabilizers, preservatives, buffers, antioxidants, or other additives
known to those of skill in the art. For injection, the agents of the invention may be
formulated in aqueous solutions, preferably in physiologically compatible buffers such as
Hanks' solution, Ringer's solution, or physiological saline buffer. For transmucosal
administration, penetrants appropriate to the barrier to be permeated are used in the
formulation. Such penetrants are generally known in the art.

For oral administration, the compounds can be formulated readily by combining
the active compounds with pharmaceutically acceptable carriers well known in the art.
Such carriers enable the compounds of the invention to be formulated as tablets, pills,
dragees, capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral
ingestion by a patient to be treated. Pharmaceutical preparations for oral use can be
obtained from a solid excipient, optionally grinding a resulting mixture, and processing
the mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or
dragee cores. Suitable excipients are, in particular, fillers such as sugars, including
lactose, sucrose, mannitol, or sorbitol; cellulose preparations such as, for example, maize
starch, wheat starch, rice starch, potato starch, gelatin, gum tragacanth, methyl cellulose,
hydroxypropylmethyl-cellulose, sodium carboxymethylcellulose, and/or
polyvinylpyrrolidone (PVP). If desired, disintegrating agents may be added, such as
cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt thereof such as sodium
alginate. Dragee cores are provided with suitable coatings. For this purpose,
concentrated sugar solutions may be used, which may optionally contain gum arabic, talc,
polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may
be added to the tablets or dragee coatings for identification or to characterize different
combinations of active compound doses.

Pharmaceutical preparations which can be used orally include push-fit capsules
made of gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as
glycerol or sorbitol. The push-fit capsules can contain the active ingredients in admixture
with filler such as lactose, binders such as starches, and/or lubricants such as talc or
magnesium stearate and, optionally, stabilizers. In soft capsules, the active compounds
may be dissolved or suspended in suitable liquids, such as fatty oils, liquid paraffin, or
liquid polyethylene glycols. In addition, stabilizers may be added. All formulations for
oral administration should be in dosages suitable for such administration. For buccal
administration, the compositions may take the form of tablets or lozenges formulated in
conventional manner.

For administration by inhalation, the compounds for use according to the present
invention are conveniently delivered in the form of an aerosol spray presentation from
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon
dioxide or other suitable gas. In the case of a pressurized aerosol the dosage unit may be
determined by providing a valve to deliver a metered amount. Capsules and cartridges
of, e.g., gelatin for use in an inhaler or insufflator may be formulated containing a powder
mix of the compound and a suitable powder base such as lactose or starch. The
compounds may be formulated for parenteral administration by injection, e.g., by bolus
injection or continuous infusion. Formulations for injection may be presented in unit
dosage form, e.g., in ampules or in multi-dose containers, with an added preservative.
The compositions may take such forms as suspensions, solutions or emulsions in oily or
aqueous vehicles, and may contain formulatory agents such as suspending, stabilizing
and/or dispersing agents.

Pharmaceutical formulations for parenteral administration include aqueous
solutions of the active compounds in water-soluble form. Additionally, suspensions of
the active compounds may be prepared as appropriate oily injection suspensions.
Suitable lipophilic solvents or vehicles include fatty oils such as sesame oil, or synthetic
fatty acid esters, such as ethyl oleate or triglycerides, or liposomes. Aqueous injection
suspensions may contain substances which increase the viscosity of the suspension, such
as sodium carboxymethyl cellulose, sorbitol, or dextran. Optionally, the suspension may
also contain suitable stabilizers or agents which increase the solubility of the compounds
to allow for the preparation of highly concentrated solutions. Alternatively, the active
ingredient may be in powder form for constitution with a suitable vehicle, e.g., sterile
pyrogen-free water, before use.

The compounds may also be formulated in rectal compositions such as
suppositories or retention enemas, e.g., containing conventional suppository bases such as
cocoa butter or other glycerides. In addition to the formulations described previously, the
compounds may also be formulated as a depot preparation. Such long acting
formulations may be administered by implantation (for example subcutaneously or
intramuscularly) or by intramuscular injection. Thus, for example, the compounds may
be formulated with suitable polymeric or hydrophobic materials (for example as an
emulsion in an acceptable oil) or ion exchange resins, or as sparingly soluble derivatives,
for example, as a sparingly soluble salt.

A pharmaceutical carrier for the hydrophobic compounds of the invention is a
co-solvent system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible
organic polymer, and an aqueous phase. The co-solvent system may be the VPD
co-solvent system. VPD is a solution of 3% w/v benzyl alcohol, 8% w/v of the nonpolar
surfactant polysorbate 80, and 65% w/v polyethylene glycol 300, made up to volume in
absolute ethanol. The VPD co-solvent system (VPD:5W) consists of VPD diluted 1:1
with a 5% dextrose in water solution. This co-solvent system dissolves hydrophobic
compounds well, and itself produces low toxicity upon systemic administration.
Naturally, the proportions of a co-solvent system may be varied considerably without
destroying its solubility and toxicity characteristics. Furthermore, the identity of the
co-solvent components may be varied: for example, other low-toxicity nonpolar
surfactants may be used instead of polysorbate 80; the fraction size of polyethylene
glycol may be varied; other biocompatible polymers may replace polyethylene glycol,
e.g., polyvinyl pyrrolidone; and other sugars or polysaccharides may substitute for
dextrose. Alternatively, other delivery systems for hydrophobic pharmaceutical
compounds may be employed. Liposomes and emulsions are well known examples of
delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents such as
dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity.
Additionally, the compounds may be delivered using a sustained-release system, such as
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent.
Various types of sustained-release materials have been established and are well known by
those skilled in the art. Sustained-release capsules may, depending on their chemical
nature, release the compounds for a few weeks up to over 100 days. Depending on the
chemical nature and the biological stability of the therapeutic reagent, additional
strategies for protein or other active ingredient stabilization may be employed.
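
The VPD and VPD:5W compositions stated above can be worked through numerically.
The following Python sketch is illustrative only: the function and variable names are
hypothetical, and the 1:1 dilution is treated as a simple halving of each w/v percentage,
which assumes the volumes of VPD and of the dextrose solution are additive.

```python
# Nominal VPD composition taken from the text, in % w/v (grams per 100 mL).
VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
# The diluent named in the text: 5% dextrose in water.
DEXTROSE_PERCENT_WV = 5.0


def grams_for_vpd(volume_ml):
    """Grams of each solute needed for volume_ml of VPD.

    The remainder of the volume is made up with absolute ethanol.
    """
    return {name: pct * volume_ml / 100.0 for name, pct in VPD_PERCENT_WV.items()}


def vpd_5w_percent_wv():
    """Nominal % w/v of each solute in VPD:5W (VPD diluted 1:1).

    Assumption: volumes are additive, so every concentration is halved.
    """
    diluted = {name: pct / 2.0 for name, pct in VPD_PERCENT_WV.items()}
    diluted["dextrose"] = DEXTROSE_PERCENT_WV / 2.0
    return diluted
```

Under these assumptions, 100 mL of VPD calls for 3 g benzyl alcohol, 8 g polysorbate 80
and 65 g polyethylene glycol 300, and VPD:5W nominally contains 1.5%, 4%, 32.5% and
2.5% w/v of benzyl alcohol, polysorbate 80, polyethylene glycol 300 and dextrose,
respectively.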

The pharmaceutical compositions also may comprise suitable solid or gel phase
carriers or excipients. Examples of such carriers or excipients include but are not limited
to calcium carbonate, calcium phosphate, various sugars, starches, cellulose derivatives,
gelatin, and polymers such as polyethylene glycols. Many of the active ingredients of the
invention may be provided as salts with pharmaceutically compatible counter ions. Such
pharmaceutically acceptable base addition salts are those salts which retain the biological
effectiveness and properties of the free acids and which are obtained by reaction with
inorganic or organic bases such as sodium hydroxide, magnesium hydroxide, ammonia,
trialkylamine, dialkylamine, monoalkylamine, dibasic amino acids, sodium acetate,
potassium benzoate, triethanol amine and the like.

The pharmaceutical composition of the invention may be in the form of a
complex of the protein(s) or other active ingredient(s) of the present invention along with
protein or peptide antigens. The protein and/or peptide antigen will deliver a stimulatory
signal to both B and T lymphocytes. B lymphocytes will respond to antigen through their
surface immunoglobulin receptor. T lymphocytes will respond to antigen through the T
cell receptor (TCR) following presentation of the antigen by MHC proteins. MHC and
structurally related proteins including those encoded by class I and class II MHC genes
on host cells will serve to present the peptide antigen(s) to T lymphocytes. The antigen
components could also be supplied as purified MHC-peptide complexes alone or with
co-stimulatory molecules that can directly signal T cells. Alternatively, antibodies able to
bind surface immunoglobulin and other molecules on B cells as well as antibodies able to
bind the TCR and other molecules on T cells can be combined with the pharmaceutical
composition of the invention.

The pharmaceutical composition of the invention may be in the form of a
liposome in which protein of the present invention is combined, in addition to other
pharmaceutically acceptable carriers, with amphipathic agents such as lipids which exist
in aggregated form as micelles, insoluble monolayers, liquid crystals, or lamellar layers
in aqueous solution. Suitable lipids for liposomal formulation include, without limitation,
monoglycerides, diglycerides, sulfatides, lysolecithins, phospholipids, saponin, bile acids,
and the like. Preparation of such liposomal formulations is within the level of skill in the
art, as disclosed, for example, in U.S. Patent Nos. 4,235,871; 4,501,728; 4,837,028; and
4,737,323, all of which are incorporated herein by reference.

The amount of protein or other active ingredient of the present invention in the
pharmaceutical composition of the present invention will depend upon the nature and
severity of the condition being treated, and on the nature of prior treatments which the
patient has undergone. Ultimately, the attending physician will decide the amount of
protein or other active ingredient of the present invention with which to treat each
individual patient. Initially, the attending physician will administer low doses of protein
or other active ingredient of the present invention and observe the patient's response.
Larger doses of protein or other active ingredient of the present invention may be
administered until the optimal therapeutic effect is obtained for the patient, and at that
point the dosage is not increased further. It is contemplated that the various
pharmaceutical compositions used to practice the method of the present invention should
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more
preferably about 0.1 µg to about 1 mg) of protein or other active ingredient of the present
invention per kg body weight. For compositions of the present invention which are
useful for bone, cartilage, tendon or ligament regeneration, the therapeutic method
includes administering the composition topically, systemically, or locally as an implant
or device. When administered, the therapeutic composition for use in this invention is, of
course, in a pyrogen-free, physiologically acceptable form. Further, the composition may
desirably be encapsulated or injected in a viscous form for delivery to the site of bone,
cartilage or tissue damage. Topical administration may be suitable for wound healing
and tissue repair. Therapeutically useful agents other than a protein or other active
ingredient of the invention which may also optionally be included in the composition as
described above, may alternatively or additionally, be administered simultaneously or
sequentially with the composition in the methods of the invention. Preferably for bone
and/or cartilage formation, the composition would include a matrix capable of delivering
the protein-containing or other active ingredient-containing composition to the site of
bone and/or cartilage damage, providing a structure for the developing bone and cartilage
and optimally capable of being resorbed into the body. Such matrices may be formed of
materials presently in use for other implanted medical applications.

The choice of matrix material is based on biocompatibility, biodegradability, 
mechanical properties, cosmetic appearance and interface properties. The particular 
application of the compositions will define the appropriate formulation. Potential 
matrices for the compositions may be biodegradable and chemically defined calcium 
sulfate, tricalcium phosphate, hydroxyapatite, polylactic acid, polyglycolic acid and 
polyanhydrides. Other potential materials are biodegradable and biologically 
well-defined, such as bone or dermal collagen. Further matrices are comprised of pure 
proteins or extracellular matrix components. Other potential matrices are 
nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass, 
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the 
above mentioned types of material, such as polylactic acid and hydroxyapatite or 
collagen and tricalcium phosphate. The bioceramics may be altered in composition, such 
as in calcium-aluminate-phosphate and processing to alter pore size, particle size, particle 
shape, and biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of 
lactic acid and glycolic acid in the form of porous particles having diameters ranging 
from 150 to 800 microns. In some applications, it will be useful to utilize a sequestering 
agent, such as carboxymethyl cellulose or autologous blood clot, to prevent the protein 
compositions from disassociating from the matrix. 

A preferred family of sequestering agents is cellulosic materials such as 
alkylcelluloses (including hydroxyalkylcelluloses), including methylcellulose, 
ethylcellulose, hydroxyethylcellulose, hydroxypropylcellulose, 
hydroxypropyl-methylcellulose, and carboxymethylcellulose, the most preferred being 
cationic salts of carboxymethylcellulose (CMC). Other preferred sequestering agents 
include hyaluronic acid, sodium alginate, poly(ethylene glycol), polyoxyethylene oxide, 
carboxyvinyl polymer and poly(vinyl alcohol). The amount of sequestering agent useful 
herein is 0.5-20 wt %, preferably 1-10 wt % based on total formulation weight, which 
represents the amount necessary to prevent desorption of the protein from the polymer 
matrix and to provide appropriate handling of the composition, yet not so much that the
progenitor cells are prevented from infiltrating the matrix, thereby providing the protein
the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with
other agents beneficial to the treatment of the bone and/or cartilage defect, wound, or
tissue in question. These agents include various growth factors such as epidermal growth
factor (EGF), platelet derived growth factor (PDGF), transforming growth factors
(TGF-α and TGF-β), and insulin-like growth factor (IGF).

The therapeutic compositions are also presently valuable for veterinary
applications. Particularly domestic animals and thoroughbred horses, in addition to
humans, are desired patients for such treatment with proteins or other active ingredients
of the present invention. The dosage regimen of a protein-containing pharmaceutical
composition to be used in tissue regeneration will be determined by the attending
physician considering various factors which modify the action of the proteins, e.g.,
amount of tissue weight desired to be formed, the site of damage, the condition of the
damaged tissue, the size of a wound, type of damaged tissue (e.g., bone), the patient's
age, sex, and diet, the severity of any infection, time of administration and other clinical
factors. The dosage may vary with the type of matrix used in the reconstitution and with
inclusion of other proteins in the pharmaceutical composition. For example, the addition
of other known growth factors, such as IGF I (insulin like growth factor I), to the final
composition, may also affect the dosage. Progress can be monitored by periodic
assessment of tissue/bone growth and/or repair, for example, X-rays, histomorphometric
determinations and tetracycline labeling.

Polynucleotides of the present invention can also be used for gene therapy. Such
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a
mammalian subject. Polynucleotides of the invention may also be administered by other
known methods for introduction of nucleic acid into a cell or organism (including,
without limitation, in the form of viral vectors or naked DNA). Cells may also be
cultured ex vivo in the presence of proteins of the present invention in order to proliferate
or to produce a desired effect on or activity in such cells. Treated cells can then be
introduced in vivo for therapeutic purposes.

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective amount to
achieve their intended purpose. More specifically, a therapeutically effective amount
means an amount effective to prevent development of or to alleviate the existing
symptoms of the subject being treated. Determination of the effective amount is well
within the capability of those skilled in the art, especially in light of the detailed
disclosure provided herein. For any compound used in the method of the invention, the
therapeutically effective dose can be estimated initially from appropriate in vitro assays.
For example, a dose can be formulated in animal models to achieve a
circulating concentration range that includes the IC50 as determined in cell culture (i.e.,
the concentration of the test compound which achieves a half-maximal inhibition of the
protein's biological activity). Such information can be used to more accurately determine
useful doses in humans.

A therapeutically effective dose refers to that amount of the compound that results
in amelioration of symptoms or a prolongation of survival in a patient. Toxicity and
therapeutic efficacy of such compounds can be determined by standard pharmaceutical
procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the
dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in
50% of the population). The dose ratio between toxic and therapeutic effects is the
therapeutic index and it can be expressed as the ratio between LD50 and ED50.
Compounds which exhibit high therapeutic indices are preferred. The data obtained from
these cell culture assays and animal studies can be used in formulating a range of dosage
for use in humans. The dosage of such compounds lies preferably within a range of
circulating concentrations that include the ED50 with little or no toxicity. The dosage
may vary within this range depending upon the dosage form employed and the route of
administration utilized. The exact formulation, route of administration and dosage can be
chosen by the individual physician in view of the patient's condition. See, e.g., Fingl et
al., 1975, in "The Pharmacological Basis of Therapeutics", Ch. 1, p. 1. Dosage amount
and interval may be adjusted individually to provide plasma levels of the active moiety
which are sufficient to maintain the desired effects, or minimal effective concentration
(MEC). The MEC will vary for each compound but can be estimated from in vitro data.
Dosages necessary to achieve the MEC will depend on individual characteristics and
route of administration. However, HPLC assays or bioassays can be used to determine
plasma concentrations.

Dosage intervals can also be determined using the MEC value. Compounds should
be administered using a regimen which maintains plasma levels above the MEC for
10-90% of the time, preferably between 30-90% and most preferably between 50-90%.
In cases of local administration or selective uptake, the effective local concentration of
the drug may not be related to plasma concentration.
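
As a rough illustration of the time-above-MEC criterion, the following sketch estimates
the fraction of a dosing interval spent at or above the MEC from evenly spaced plasma
measurements and maps it onto the three ranges stated above. The function names, the
sampling scheme and the concentration values are assumptions made for this example only.

```python
def fraction_above_mec(concentrations, mec):
    """Fraction of evenly spaced plasma samples at or above the MEC."""
    if not concentrations:
        raise ValueError("need at least one sample")
    above = sum(1 for c in concentrations if c >= mec)
    return above / len(concentrations)


def regimen_tier(fraction):
    """Map a time-above-MEC fraction onto the ranges given in the text."""
    if 0.50 <= fraction <= 0.90:
        return "most preferred (50-90%)"
    if 0.30 <= fraction <= 0.90:
        return "preferred (30-90%)"
    if 0.10 <= fraction <= 0.90:
        return "acceptable (10-90%)"
    return "outside the stated range"


# Hypothetical hourly plasma concentrations (arbitrary units) over one interval.
samples = [0.5, 2.0, 3.5, 3.0, 2.4, 1.8, 1.2, 0.9, 0.6, 0.4]
fraction = fraction_above_mec(samples, mec=1.0)
tier = regimen_tier(fraction)
```

Counting samples is only an approximation of time above the MEC; denser sampling or
interpolation between measurements would refine the estimate.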

An exemplary dosage regimen for polypeptides or other compositions of the
invention will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily,
with the preferred dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily,
varying in adults and children. Dosing may be once daily, or equivalent doses may be
delivered at longer or shorter intervals.
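
The weight-based ranges above translate into absolute daily amounts by simple
multiplication. The sketch below assumes that the lower bounds are expressed in
micrograms per kilogram and the upper bounds in milligrams per kilogram; the function
name and the 70 kg body weight are illustrative assumptions.

```python
UG_PER_MG = 1000.0  # micrograms per milligram


def daily_dose_range_mg(weight_kg, low_ug_per_kg, high_mg_per_kg):
    """Return (lowest, highest) daily amount in mg for a given body weight."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_mg = weight_kg * low_ug_per_kg / UG_PER_MG
    high_mg = weight_kg * high_mg_per_kg
    return low_mg, high_mg


# Hypothetical 70 kg adult; bounds are read from the ranges in the text.
broad_low, broad_high = daily_dose_range_mg(70.0, 0.01, 100.0)
preferred_low, preferred_high = daily_dose_range_mg(70.0, 0.1, 25.0)
```

The computed figures only bound the regimen; the amount actually administered remains
a matter for the prescribing physician.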

The amount of composition administered will, of course, be dependent on the
subject being treated, on the subject's age and weight, the severity of the affliction, the
manner of administration and the judgment of the prescribing physician.



4.12.4 PACKAGING

The compositions may, if desired, be presented in a pack or dispenser device
which may contain one or more unit dosage forms containing the active ingredient. The
pack may, for example, comprise metal or plastic foil, such as a blister pack. The pack or
dispenser device may be accompanied by instructions for administration. Compositions
comprising a compound of the invention formulated in a compatible pharmaceutical
carrier may also be prepared, placed in an appropriate container, and labeled for
treatment of an indicated condition.


4.13 ANTIBODIES

Also included in the invention are antibodies to proteins, or fragments of proteins
of the invention. The term "antibody" as used herein refers to immunoglobulin molecules
and immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules
that contain an antigen-binding site that specifically binds (immunoreacts with) an
antigen. Such antibodies include, but are not limited to, polyclonal, monoclonal,
chimeric, single chain, Fab, Fab' and F(ab')2 fragments, and an Fab expression library. In
general, an antibody molecule obtained from humans relates to any of the classes IgG,
IgM, IgA, IgE and IgD, which differ from one another by the nature of the heavy chain
present in the molecule. Certain classes have subclasses as well, such as IgG1, IgG2, and
others. Furthermore, in humans, the light chain may be a kappa chain or a lambda chain.
Reference herein to antibodies includes a reference to all such classes, subclasses and
types of human antibody species.

An isolated related protein of the invention, or a portion or fragment thereof, may
serve as an antigen and can additionally be used as an immunogen
to generate antibodies that immunospecifically bind the antigen, using standard
techniques for polyclonal and monoclonal antibody preparation. The full-length protein
can be used or, alternatively, the invention provides antigenic peptide fragments of the
antigen for use as immunogens. An antigenic peptide fragment comprises at least 6
amino acid residues of the amino acid sequence of the full length protein, such as an
amino acid sequence shown in SEQ ID NO: 1-438, and encompasses an epitope thereof
such that an antibody raised against the peptide forms a specific immune complex with
the full length protein or with any fragment that contains the epitope. Preferably, the
antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino acid
residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that are located
on its surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of alpha-2-macroglobulin-like protein that is located on the
surface of the protein, e.g., a hydrophilic region. A hydrophobicity analysis of the human
related protein sequence will indicate which regions of a related protein are particularly
hydrophilic and, therefore, are likely to encode surface residues useful for targeting
antibody production. As a means for targeting antibody production, hydropathy plots
showing regions of hydrophilicity and hydrophobicity may be generated by any method
well known in the art, including, for example, the Kyte Doolittle or the Hopp Woods
methods, either with or without Fourier transformation. See, e.g., Hopp and Woods,
1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J. Mol. Biol.
157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or
derivatives, fragments, analogs or homologs thereof, are also provided herein.
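
A hydropathy profile of the kind referred to above can be sketched as a sliding-window
average over the Kyte-Doolittle scale, in which more negative averages indicate more
hydrophilic stretches. The following is a minimal illustration only: it omits the Hopp
Woods scale and any Fourier transformation, and the function names, the window length
and the test sequence are hypothetical rather than taken from any sequence of the
invention.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean Kyte-Doolittle score for every window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]  # KeyError on unknown residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, peptide, mean score) of the most hydrophilic window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Hypothetical sequence: a charged stretch flanked by hydrophobic residues.
start, peptide, score = most_hydrophilic_window("LLLLLLLKDEKRDEVVVVVVV", window=6)
```

A window of six residues is used here because it matches the minimum antigenic peptide
length stated earlier; wider windows smooth the profile at the cost of resolution.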

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.

The term "specific for" indicates that the variable regions of the antibodies of the
invention recognize and bind polypeptides of the invention exclusively (i.e., able to
distinguish the polypeptide of the invention from other similar polypeptides despite
sequence identity, homology, or similarity found in the family of polypeptides), but may
also interact with other proteins (for example, S. aureus protein A or other antibodies in
ELISA techniques) through interactions with sequences outside the variable region of the
antibodies, and in particular, in the constant region of the molecule. Screening assays to
determine binding specificity of an antibody of the invention are well known and
routinely practiced in the art. For a comprehensive discussion of such assays, see Harlow
et al. (Eds), Antibodies A Laboratory Manual; Cold Spring Harbor Laboratory; Cold
Spring Harbor, NY (1988), Chapter 6. Antibodies that recognize and bind fragments of
the polypeptides of the invention are also contemplated, provided that the antibodies are
first and foremost specific for, as defined above, full-length polypeptides of the
invention. As with antibodies that are specific for full length polypeptides of the
invention, antibodies of the invention that recognize fragments are those which can
distinguish polypeptides from the same family of polypeptides despite inherent sequence
identity, homology, or similarity found in the family of proteins.

Antibodies of the invention are useful, for example, for therapeutic purposes (by
modulating activity of a polypeptide of the invention), for diagnostic purposes to detect or
quantitate a polypeptide of the invention, as well as for purification of a polypeptide of the
invention. Kits comprising an antibody of the invention for any of the purposes
described herein are also comprehended. In general, a kit of the invention also includes a
control antigen for which the antibody is immunospecific. The invention further provides
a hybridoma that produces an antibody according to the invention. Antibodies of the
invention are useful for detection and/or purification of the polypeptides of the invention.
Monoclonal antibodies binding to the protein of the invention may be useful
diagnostic agents for the immunodetection of the protein. Neutralizing monoclonal
antibodies binding to the protein may also be useful therapeutics both for conditions
associated with the protein and in the treatment of some forms of cancer where
abnormal expression of the protein is involved. In the case of cancerous cells or
leukemic cells, neutralizing monoclonal antibodies against the protein may be useful in
detecting and preventing the metastatic spread of the cancerous cells, which may be
mediated by the protein.

The labeled antibodies of the present invention can be used for in vitro, in vivo,
and in situ assays to identify cells or tissues in which a fragment of the polypeptide of
interest is expressed. The antibodies may also be used directly in therapies or other
diagnostics. The present invention further provides the above-described antibodies
immobilized on a solid support. Examples of such solid supports include plastics such as
polycarbonate, complex carbohydrates such as agarose and Sepharose®, acrylic resins
such as polyacrylamide, and latex beads. Techniques for coupling antibodies to such
solid supports are well known in the art (Weir, D.M. et al., "Handbook of Experimental
Immunology" 4th Ed., Blackwell Scientific Publications, Oxford, England, Chapter 10
(1986); Jacoby, W.D. et al., Meth. Enzym. 34 Academic Press, N.Y. (1974)). The
immobilized antibodies of the present invention can be used for in vitro, in vivo, and in
situ assays as well as for immuno-affinity purification of the proteins of the present
invention.

Various procedures known within the art may be used for the production of
polyclonal or monoclonal antibodies directed against a protein of the invention, or against
derivatives, fragments, analogs, homologs or orthologs thereof (see, for example,
Antibodies: A Laboratory Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY, incorporated herein by reference). Some of
these antibodies are discussed below.

4.13.1 POLYCLONAL ANTIBODIES 

For the production of polyclonal antibodies, various suitable host animals (e.g.,
rabbit, goat, mouse or other mammal) may be immunized by one or more injections with
the native protein, a synthetic variant thereof, or a derivative of the foregoing. An
appropriate immunogenic preparation can contain, for example, the naturally occurring
immunogenic protein, a chemically synthesized polypeptide representing the
immunogenic protein, or a recombinantly expressed immunogenic protein. Furthermore,
the protein may be conjugated to a second protein known to be immunogenic in the
mammal being immunized. Examples of such immunogenic proteins include but are not
limited to keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and
soybean trypsin inhibitor. The preparation can further include an adjuvant. Various
adjuvants used to increase the immunological response include, but are not limited to,
Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide),
surface-active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil
emulsions, dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin
and Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants that can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can
be isolated from the mammal (e.g., from the blood) and further purified by well known
techniques, such as affinity chromatography using protein A or protein G, which provide
primarily the IgG fraction of immune serum. Subsequently, or alternatively, the specific
antigen which is the target of the immunoglobulin sought, or an epitope thereof, may be
immobilized on a column to purify the immune specific antibody by immunoaffinity
chromatography. Purification of immunoglobulins is discussed, for example, by D.
Wilkinson (The Scientist, published by The Scientist, Inc., Philadelphia PA, Vol. 14, No.
8 (April 17, 2000), pp. 25-28).

4.13.2 MONOCLONAL ANTIBODIES

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition",
as used herein, refers to a population of antibody molecules that contain only one
molecular species of antibody molecule consisting of a unique light chain gene product
and a unique heavy chain gene product. In particular, the complementarity determining
regions (CDRs) of the monoclonal antibody are identical in all the molecules of the
population. MAbs thus contain an antigen-binding site capable of immunoreacting with a
particular epitope of the antigen characterized by a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a
mouse, hamster, or other appropriate host animal, is typically immunized with an
immunizing agent to elicit lymphocytes that produce or are capable of producing
antibodies that will specifically bind to the immunizing agent. Alternatively, the
lymphocytes can be immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment
thereof or a fusion protein thereof. Generally, either peripheral blood lymphocytes are
used if cells of human origin are desired, or spleen cells or lymph node cells are used if
non-human mammalian sources are desired. The lymphocytes are then fused with an
immortalized cell line using a suitable fusing agent, such as polyethylene glycol, to form
a hybridoma cell (Goding, Monoclonal Antibodies: Principles and Practice, Academic
Press, (1986) pp. 59-103). Immortalized cell lines are usually transformed mammalian
cells, particularly myeloma cells of rodent, bovine and human origin. Usually, rat or
mouse myeloma cell lines are employed. The hybridoma cells can be cultured in a
suitable culture medium that preferably contains one or more substances that inhibit the
growth or survival of the unfused, immortalized cells. For example, if the parental cells
lack the enzyme hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT),
the culture medium for the hybridomas typically will include hypoxanthine, aminopterin,
and thymidine ("HAT medium"), which substances prevent the growth of
HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable
high level expression of antibody by the selected antibody-producing cells, and are
sensitive to a medium such as HAT medium. More preferred immortalized cell lines are
murine myeloma lines, which can be obtained, for instance, from the Salk Institute Cell
Distribution Center, San Diego, California and the American Type Culture Collection,
Manassas, Virginia. Human myeloma and mouse-human heteromyeloma cell lines also
have been described for the production of human monoclonal antibodies (Kozbor, J.
Immunol., 133:3001 (1984); Brodeur et al., Monoclonal Antibody Production Techniques
and Applications, Marcel Dekker, Inc., New York, (1987) pp. 51-63).

The culture medium in which the hybridoma cells are cultured can then be
assayed for the presence of monoclonal antibodies directed against the antigen.
Preferably, the binding specificity of monoclonal antibodies produced by the hybridoma
cells is determined by immunoprecipitation or by an in vitro binding assay, such as
radioimmunoassay (RIA) or enzyme-linked immunoabsorbent assay (ELISA). Such
techniques and assays are known in the art. The binding affinity of the monoclonal
antibody can, for example, be determined by the Scatchard analysis of Munson and
Pollard, Anal. Biochem., 107:220 (1980). Preferably, antibodies having a high degree of
specificity and a high binding affinity for the target antigen are isolated.
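
For orientation, the classical linear Scatchard transformation can be sketched as
follows: for a single class of binding sites, bound/free plotted against bound gives a
straight line with slope -1/Kd and x-intercept Bmax. This is a simplified illustration
and not necessarily the procedure of the cited reference; the function name and the
synthetic binding data are assumptions.

```python
def scatchard_fit(bound, free):
    """Least-squares line through (bound, bound/free); returns (Kd, Bmax)."""
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # equals -1 / Kd
    intercept = mean_y - slope * mean_x  # equals Bmax / Kd
    kd = -1.0 / slope
    bmax = intercept * kd
    return kd, bmax


# Synthetic single-site data generated with Kd = 2 and Bmax = 10 (arbitrary units).
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [10.0 * f / (2.0 + f) for f in free_conc]
kd, bmax = scatchard_fit(bound_conc, free_conc)
affinity = 1.0 / kd  # association constant, the reciprocal of Kd
```

A smaller Kd (larger association constant) corresponds to the high binding affinity
preferred above; curved Scatchard plots would indicate that the single-site assumption
does not hold.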

After the desired hybridoma cells are identified, the clones can be subcloned by
limiting dilution procedures and grown by standard methods. Suitable culture media for
this purpose include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640
medium. Alternatively, the hybridoma cells can be grown in vivo as ascites in a
mammal.

The monoclonal antibodies secreted by the subclones can be isolated or purified
from the culture medium or ascites fluid by conventional immunoglobulin purification
procedures such as, for example, protein A-Sepharose, hydroxylapatite chromatography,
gel electrophoresis, dialysis, or affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such
as those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal
antibodies of the invention can be readily isolated and sequenced using conventional
procedures (e.g., by using oligonucleotide probes that are capable of binding specifically
to genes encoding the heavy and light chains of murine antibodies). The hybridoma cells
of the invention serve as a preferred source of such DNA. Once isolated, the DNA can
be placed into expression vectors, which are then transfected into host cells such as
simian COS cells, Chinese hamster ovary (CHO) cells, or myeloma cells that do not
otherwise produce immunoglobulin protein, to obtain the synthesis of monoclonal
antibodies in the recombinant host cells. The DNA also can be modified, for example, by
substituting the coding sequence for human heavy and light chain constant domains in
place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature
368, 812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all
or part of the coding sequence for a non-immunoglobulin polypeptide. Such a
non-immunoglobulin polypeptide can be substituted for the constant domains of an antibody
of the invention, or can be substituted for the variable domains of one antigen-combining
site of an antibody of the invention to create a chimeric bivalent antibody.

4.13.3 HUMANIZED ANTIBODIES

The antibodies directed against the protein antigens of the invention can further
comprise humanized antibodies or human antibodies. These antibodies are suitable for
administration to humans without engendering an immune response by the human against
the administered immunoglobulin. Humanized forms of antibodies are chimeric
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab',
F(ab')2 or other antigen-binding subsequences of antibodies) that are principally
comprised of the sequence of a human immunoglobulin, and contain minimal sequence
derived from a non-human immunoglobulin. Humanization can be performed following
the method of Winter and co-workers (Jones et al., Nature, 321:522-525 (1986);
Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al., Science, 239:1534-1536
(1988)), by substituting rodent CDRs or CDR sequences for the corresponding sequences
of a human antibody. (See also U.S. Patent No. 5,225,539). In some instances, Fv
framework residues of the human immunoglobulin are replaced by corresponding
non-human residues. Humanized antibodies can also comprise residues that are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general,
the humanized antibody will comprise substantially all of at least one, and typically two,
variable domains, in which all or substantially all of the CDR regions correspond to those
of a non-human immunoglobulin and all or substantially all of the framework regions are
those of a human immunoglobulin consensus sequence. The humanized antibody
optimally also will comprise at least a portion of an immunoglobulin constant region
(Fc), typically that of a human immunoglobulin (Jones et al., 1986; Riechmann et al.,
1988; and Presta, Curr. Op. Struct. Biol., 2:593-596 (1992)).


4.13.4 HUMAN ANTIBODIES 

Fully human antibodies relate to antibody molecules in which essentially the
entire sequences of both the light chain and the heavy chain, including the CDRs, arise
from human genes. Such antibodies are termed "human antibodies", or "fully human
antibodies" herein. Human monoclonal antibodies can be prepared by the trioma
technique; the human B-cell hybridoma technique (see Kozbor, et al., 1983 Immunol
Today 4: 72) and the EBV hybridoma technique to produce human monoclonal
antibodies (see Cole, et al., 1985 In: Monoclonal Antibodies and Cancer Therapy,
Alan R. Liss, Inc., pp. 77-96). Human monoclonal antibodies may be utilized in the
practice of the present invention and may be produced by using human hybridomas (see
Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by transforming human
B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381
(1991); Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be
made by introducing human immunoglobulin loci into transgenic animals, e.g., mice in
which the endogenous immunoglobulin genes have been partially or completely
inactivated. Upon challenge, human antibody production is observed, which closely
resembles that seen in humans in all respects, including gene rearrangement, assembly,
and antibody repertoire. This approach is described, for example, in U.S. Patent Nos.
5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425; 5,661,016, and in Marks et al.
(Bio/Technology 10, 779-783 (1992)); Lonberg et al. (Nature 368, 856-859 (1994));
Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature Biotechnology 14, 845-51
(1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and Lonberg and Huszar
(Intern. Rev. Immunol. 13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman
animals that are modified so as to produce fully human antibodies rather than the
animal's endogenous antibodies in response to challenge by an antigen. (See PCT
publication WO94/02602). The endogenous genes encoding the heavy and light
immunoglobulin chains in the nonhuman host have been incapacitated, and active loci
encoding human heavy and light chain immunoglobulins are inserted into the host's
genome. The human genes are incorporated, for example, using yeast artificial
chromosomes containing the requisite human DNA segments. An animal which provides
all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the
Xenomouse™ as disclosed in PCT publications WO 96/33735 and WO 96/34096. This
animal produces B cells that secrete fully human immunoglobulins. The antibodies can
be obtained directly from the animal after immunization with an immunogen of interest,
as, for example, a preparation of a polyclonal antibody, or alternatively from
immortalized B cells derived from the animal, such as hybridomas producing monoclonal
antibodies. Additionally, the genes encoding the immunoglobulins with human variable
regions can be recovered and expressed to obtain the antibodies directly, or can be further
modified to obtain analogs of antibodies such as, for example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse,
lacking expression of an endogenous immunoglobulin heavy chain is disclosed in U.S.
Patent No. 5,939,598. It can be obtained by a method including deleting the J segment
genes from at least one endogenous heavy chain locus in an embryonic stem cell to
prevent rearrangement of the locus and to prevent formation of a transcript of a
rearranged immunoglobulin heavy chain locus, the deletion being effected by a targeting
vector containing a gene encoding a selectable marker, and producing from the
embryonic stem cell a transgenic mouse whose somatic and germ cells contain the gene
encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is
disclosed in U.S. Patent No. 5,916,771. It includes introducing an expression vector that
contains a nucleotide sequence encoding a heavy chain into one mammalian host cell in
culture, introducing an expression vector containing a nucleotide sequence encoding a
light chain into another mammalian host cell, and fusing the two cells to form a hybrid
cell. The hybrid cell expresses an antibody containing the heavy chain and the light
chain.

In a further improvement on this procedure, a method for identifying a clinically
relevant epitope on an immunogen, and a correlative method for selecting an antibody
that binds immunospecifically to the relevant epitope with high affinity, are disclosed in
PCT publication WO 99/53049.

4.13.5 FAB FRAGMENTS AND SINGLE CHAIN ANTIBODIES

According to the invention, techniques can be adapted for the production of
single-chain antibodies specific to an antigenic protein of the invention (see e.g., U.S.
Patent No. 4,946,778). In addition, methods can be adapted for the construction of Fab
expression libraries (see e.g., Huse, et al., 1989 Science 246: 1275-1281) to allow rapid
and effective identification of monoclonal fragments with the desired specificity for a
protein or derivatives, fragments, analogs or homologs thereof. Antibody fragments that
contain the idiotypes to a protein antigen may be produced by techniques known in the
art including, but not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an
antibody molecule; (ii) an Fab fragment generated by reducing the disulfide bridges of an
F(ab')2 fragment; (iii) an Fab fragment generated by the treatment of the antibody molecule
with papain and a reducing agent; and (iv) Fv fragments.

4.13.6 BISPECIFIC ANTIBODIES

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies
that have binding specificities for at least two different antigens. In the present case, one
of the binding specificities is for an antigenic protein of the invention. The second
binding target is any other antigen, and advantageously is a cell-surface protein or
receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have
different specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the
random assortment of immunoglobulin heavy and light chains, these hybridomas
(quadromas) produce a potential mixture of ten different antibody molecules, of which
only one has the correct bispecific structure. The purification of the correct molecule is
usually accomplished by affinity chromatography steps. Similar procedures are disclosed
in WO 93/08829, published 13 May 1993, and in Traunecker et al., 1991, EMBO J.,
10:3655-3659.
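
The figure of ten molecules follows from simple counting, which the sketch below
reproduces under stated assumptions: each antibody consists of two heavy-light arms,
either heavy chain can pair with either light chain, and the two arms of a molecule are
interchangeable. The chain labels are hypothetical.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ("H1", "H2")  # hypothetical labels for the two heavy chains
light_chains = ("L1", "L2")  # hypothetical labels for the two light chains

# Random heavy-light pairing gives four possible arms.
arms = [h + l for h, l in product(heavy_chains, light_chains)]

# An antibody is an unordered pair of arms, repeats allowed.
species = list(combinations_with_replacement(arms, 2))

# Only the pairing of the two cognate arms is the desired bispecific molecule.
bispecific = [s for s in species if set(s) == {"H1L1", "H2L2"}]
```

With four possible arms, the number of unordered pairs with repetition is ten, and
exactly one of them combines both cognate heavy-light arms.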

Antibody variable domains with the desired binding specificities
(antibody-antigen combining sites) can be fused to immunoglobulin constant domain sequences.
The fusion preferably is with an immunoglobulin heavy-chain constant domain,
comprising at least part of the hinge, CH2, and CH3 regions. It is preferred to have the
first heavy-chain constant region (CH1) containing the site necessary for light-chain
binding present in at least one of the fusions. DNAs encoding the immunoglobulin
heavy-chain fusions and, if desired, the immunoglobulin light chain, are inserted into
separate expression vectors, and are co-transfected into a suitable host organism. For
further details of generating bispecific antibodies see, for example, Suresh et al., Methods
in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between
a pair of antibody molecules can be engineered to maximize the percentage of
heterodimers that are recovered from recombinant cell culture. The preferred interface
comprises at least a part of the CH3 region of an antibody constant domain. In this
method, one or more small amino acid side chains from the interface of the first antibody
molecule are replaced with larger side chains (e.g. tyrosine or tryptophan).
Compensatory "cavities" of identical or similar size to the large side chain(s) are created
on the interface of the second antibody molecule by replacing large amino acid side
chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as
homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody
fragments (e.g. F(ab')2 bispecific antibodies). Techniques for generating bispecific
antibodies from antibody fragments have been described in the literature. For example,
bispecific antibodies can be prepared using chemical linkage. Brennan et al., Science
229:81 (1985) describe a procedure wherein intact antibodies are proteolytically cleaved
to generate F(ab')2 fragments. These fragments are reduced in the presence of the dithiol
complexing agent sodium arsenite to stabilize vicinal dithiols and prevent intermolecular
disulfide formation. The Fab' fragments generated are then converted to
thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB derivatives is then
reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is mixed with an
equimolar amount of the other Fab'-TNB derivative to form the bispecific antibody. The
bispecific antibodies produced can be used as agents for the selective immobilization of
enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and
chemically coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med.
175:217-225 (1992) describe the production of a fully humanized bispecific antibody
F(ab')2 molecule. Each Fab' fragment was separately secreted from E. coli and subjected
to directed chemical coupling in vitro to form the bispecific antibody. The bispecific
antibody thus formed was able to bind to cells overexpressing the ErbB2 receptor and
normal human T cells, as well as trigger the lytic activity of human cytotoxic
lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments
directly from recombinant cell culture have also been described. For example, bispecific
antibodies have been produced using leucine zippers. Kostelny et al., J. Immunol.
148(5):1547-1553 (1992). The leucine zipper peptides from the Fos and Jun proteins
were linked to the Fab' portions of two different antibodies by gene fusion. The antibody
homodimers were reduced at the hinge region to form monomers and then re-oxidized to
form the antibody heterodimers. This method can also be utilized for the production of
antibody homodimers. The "diabody" technology described by Hollinger et al., Proc.
Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an alternative mechanism for
making bispecific antibody fragments. The fragments comprise a heavy-chain variable
domain (VH) connected to a light-chain variable domain (VL) by a linker which is too
short to allow pairing between the two domains on the same chain. Accordingly, the VH
and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another
strategy for making bispecific antibody fragments by the use of single-chain Fv (sFv)
dimers has also been reported. See Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example,
trispecific antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of
which originates in the protein antigen of the invention. Alternatively, an anti-antigenic
arm of an immunoglobulin molecule can be combined with an arm which binds to a
triggering molecule on a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3,
CD28, or B7), or Fc receptors for IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and
FcγRIII (CD16) so as to focus cellular defense mechanisms to the cell expressing the
particular antigen. Bispecific antibodies can also be used to direct cytotoxic agents to
cells which express a particular antigen. These antibodies possess an antigen-binding
arm and an arm which binds a cytotoxic agent or a radionuclide chelator, such as
EOTUBE, DPTA, DOTA, or TETA. Another bispecific antibody of interest binds the
protein antigen described herein and further binds tissue factor (TF).

4.13.7 HETEROCONJUGATE ANTIBODIES 

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such
antibodies have, for example, been proposed to target immune system cells to unwanted
cells (U.S. Patent No. 4,676,980), and for treatment of HIV infection (WO 91/00360;
WO 92/200373; EP 03089). It is contemplated that the antibodies can be prepared in
vitro using known methods in synthetic protein chemistry, including those involving
crosslinking agents. For example, immunotoxins can be constructed using a disulfide
exchange reaction or by forming a thioether bond. Examples of suitable reagents for this 
purpose include iminothiolate and methyl-4-mercaptobutyrimidate and those disclosed, 
for example, in U.S. Patent No. 4,676,980. 

4.13.8 EFFECTOR FUNCTION ENGINEERING






It can be desirable to modify the antibody of the invention with respect to effector
function, so as to enhance, e.g., the effectiveness of the antibody in treating cancer. For
example, cysteine residue(s) can be introduced into the Fc region, thereby allowing
interchain disulfide bond formation in this region. The homodimeric antibody thus
generated can have improved internalization capability and/or increased
complement-mediated cell killing and antibody-dependent cellular cytotoxicity (ADCC).
See Caron et al., J. Exp. Med., 176:1191-1195 (1992) and Shopes, J. Immunol.,
148:2918-2922 (1992). Homodimeric antibodies with enhanced anti-tumor activity can
also be prepared using heterobifunctional cross-linkers as described in Wolff et al.,
Cancer Research, 53:2560-2565 (1993). Alternatively, an antibody can be engineered that
has dual Fc regions and can thereby have enhanced complement lysis and ADCC
capabilities. See Stevenson et al., Anti-Cancer Drug Design, 3:219-230 (1989).

4.13.9 IMMUNOCONJUGATES 

The invention also pertains to immunoconjugates comprising an antibody
conjugated to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an
enzymatically active toxin of bacterial, fungal, plant, or animal origin, or fragments
thereof), or a radioactive isotope (i.e., a radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have
been described above. Enzymatically active toxins and fragments thereof that can be
used include diphtheria A chain, nonbinding active fragments of diphtheria toxin,
exotoxin A chain (from Pseudomonas aeruginosa), ricin A chain, abrin A chain,
modeccin A chain, alpha-sarcin, Aleurites fordii proteins, dianthin proteins, Phytolacca
americana proteins (PAPI, PAPII, and PAP-S), Momordica charantia inhibitor, curcin,
crotin, Saponaria officinalis inhibitor, gelonin, mitogellin, restrictocin, phenomycin,
enomycin, and the trichothecenes. A variety of radionuclides are available for the
production of radioconjugated antibodies. Examples include ²¹²Bi, ¹³¹I, ¹³¹In, ⁹⁰Y, and
¹⁸⁶Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of
bifunctional protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol)
propionate (SPDP), iminothiolane (IT), bifunctional derivatives of imidoesters (such as
dimethyl adipimidate HCl), active esters (such as disuccinimidyl suberate), aldehydes
(such as glutaraldehyde), bis-azido compounds (such as bis(p-azidobenzoyl)
hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene
2,6-diisocyanate), and bis-active fluorine compounds (such as
1,5-difluoro-2,4-dinitrobenzene). For example, a ricin
immunotoxin can be prepared as described in Vitetta et al., Science, 238:1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionucleotide to the
antibody. See WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate
is administered to the patient, followed by removal of unbound conjugate from the
circulation using a clearing agent and then administration of a "ligand" (e.g., avidin) that
is in turn conjugated to a cytotoxic agent.




4.14 COMPUTER READABLE SEQUENCES 
In one application of this embodiment, a nucleotide sequence of the present 
invention can be recorded on computer readable media. As used herein, "computer
readable media" refers to any medium which can be read and accessed directly by a
computer. Such media include, but are not limited to: magnetic storage media, such as 
floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as 
CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these 
categories such as magnetic/optical storage media. A skilled artisan can readily
appreciate how any of the presently known computer readable mediums can be used to
create a manufacture comprising computer readable medium having recorded thereon a 
nucleotide sequence of the present invention. As used herein, "recorded" refers to a 
process for storing information on computer readable medium. A skilled artisan can 
readily adopt any of the presently known methods for recording information on computer
readable medium to generate manufactures comprising the nucleotide sequence
information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means
chosen to access the stored information. In addition, a variety of data processor programs
and formats can be used to store the nucleotide sequence information of the present
invention on computer readable medium. The sequence information can be represented
in a word processing text file, formatted in commercially-available software such as
WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a
database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can
readily adapt any number of data processor structuring formats (e.g. text file or database)
in order to obtain computer readable medium having recorded thereon the nucleotide
sequence information of the present invention.
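
By way of illustration only, the two storage formats named above (an ASCII text file and a database application) might be sketched as follows. The sketch assumes Python with its standard-library SQLite module standing in for a database such as DB2, Sybase or Oracle; the function names, the table layout and the placeholder sequence are hypothetical and are not taken from the sequence listing.

```python
import os
import sqlite3
import tempfile


def write_fasta(path, records, width=60):
    """Record {identifier: sequence} pairs as a plain ASCII (FASTA-style) text file."""
    with open(path, "w", encoding="ascii") as handle:
        for seq_id, seq in records.items():
            handle.write(f">{seq_id}\n")
            for start in range(0, len(seq), width):
                handle.write(seq[start:start + width] + "\n")


def read_fasta(path):
    """Read the text file back into a {identifier: sequence} dictionary."""
    records, seq_id = {}, None
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                seq_id = line[1:]
                records[seq_id] = ""
            elif line and seq_id is not None:
                records[seq_id] += line
    return records


def store_in_database(conn, records):
    """Record the same sequences in a relational table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO sequences (seq_id, residues) VALUES (?, ?)",
        list(records.items()),
    )
    conn.commit()


def fetch_from_database(conn, seq_id):
    """Return the stored residues for seq_id, or None if it is not recorded."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    # Placeholder sequence for illustration; not a sequence of the invention.
    example = {"placeholder_1": "ATGGCC" * 25}
    with tempfile.TemporaryDirectory() as folder:
        fasta_path = os.path.join(folder, "sequences.fasta")
        write_fasta(fasta_path, example)
        print(read_fasta(fasta_path) == example)
    connection = sqlite3.connect(":memory:")
    store_in_database(connection, example)
    print(fetch_from_database(connection, "placeholder_1") == example["placeholder_1"])
```

Either representation holds the same information; as the text notes, the choice between them would generally follow from how the stored sequence is to be accessed.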

By providing any of the nucleotide sequences SEQ ID NOs: 1 - 438 or a
representative fragment thereof; or a nucleotide sequence at least 95% identical to any of
the nucleotide sequences of SEQ ID NOs: 1 - 438 in computer readable form, a skilled
artisan can routinely access the sequence information for a variety of purposes.
Computer software is publicly available which allows a skilled artisan to access sequence
information provided in a computer readable medium. The examples which follow
demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol.
215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993))
search algorithms on a Sybase system is used to identify open reading frames (ORFs) 
within a nucleic acid sequence. Such ORFs may be protein encoding fragments and may 
be useful in producing commercially important proteins such as enzymes used in 
fermentation reactions and in the production of commercially useful metabolites. 
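
As a hedged illustration of what identifying ORFs within a recorded sequence can involve, the following Python sketch scans both strands for in-frame ATG...stop stretches. It is a simple codon scan and is not the BLAST or BLAZE algorithm; the function names, the minimum-length parameter and the short example sequence are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs_in_strand(seq, min_codons=3):
    """Return (start, end, frame) for each ATG...stop ORF on one strand.

    Coordinates are 0-based, end is exclusive and includes the stop codon.
    min_codons counts the codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None  # look for the next start codon in this frame
    return orfs


def find_orfs(seq, min_codons=3):
    """Scan both strands; minus-strand coordinates refer to the reverse complement."""
    plus = [("+",) + orf for orf in find_orfs_in_strand(seq, min_codons)]
    minus = [("-",) + orf
             for orf in find_orfs_in_strand(reverse_complement(seq), min_codons)]
    return plus + minus


if __name__ == "__main__":
    # Placeholder sequence for illustration; not a sequence of the invention.
    print(find_orfs("CCATGAAACCCGGGTAACC"))
```

In practice a minimum length far larger than three codons would normally be used, since short in-frame stretches occur frequently by chance.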

As used herein, "a computer-based system" refers to the hardware means,
software means, and data storage means used to analyze the nucleotide sequence
information of the present invention. The minimum hardware means of the
computer-based systems of the present invention comprises a central processing unit
(CPU), input means, output means, and data storage means. A skilled artisan can readily
appreciate that any one of the currently available computer-based systems is suitable for
use in the present invention. As stated above, the computer-based systems of the present
invention comprise a data storage means having stored therein a nucleotide sequence of
the present invention and the necessary hardware means and software means for
supporting and implementing a search means. As used herein, "data storage means"
refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded
thereon the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or target
structural motif with the sequence information stored within the data storage means.
Search means are used to identify fragments or regions of a known sequence which
match a particular target sequence or target motif. A variety of known algorithms are
disclosed publicly, and a variety of commercially available software for conducting search
means is available and can be used in the computer-based systems of the present invention.
Examples of such software include, but are not limited to, Smith-Waterman, MacPattern
(EMBL), BLASTN and BLASTA (NPOLYPEPTIDEIA). A skilled artisan can readily
recognize that any one of the available algorithms or implementing software packages for
conducting homology searches can be adapted for use in the present computer-based
systems. As used herein, a "target sequence" can be any nucleic acid or amino acid
sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will
be present as a random occurrence in the database. The most preferred sequence length
of a target sequence is from about 10 to 300 amino acids, more preferably from about 30
to 100 nucleotide residues. However, it is well recognized that searches for
commercially important fragments, such as sequence fragments involved in gene
expression and protein processing, may be of shorter length.
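
The relationship between target length and chance matches can be made concrete with a rough estimate. The sketch below assumes, purely for illustration, that residues are independent and equally frequent, so that a target of length n matches at any given position with probability k^-n (k = 4 for nucleotides, 20 for amino acids); real databases deviate from this model, and the function name is hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of chance exact matches of one target in a database.

    Assumes independent, uniformly distributed residues (an idealization).
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * alphabet_size ** (-target_length)


if __name__ == "__main__":
    database_length = 1_000_000  # hypothetical database size in nucleotides
    for length in (6, 10, 20, 30):
        hits = expected_random_hits(length, database_length)
        print(f"target of {length} nt: about {hits:.3g} chance matches")
```

Under this model each additional nucleotide reduces the expected number of chance matches fourfold, which is why longer target sequences are less likely to be present as a random occurrence.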

As used herein, "a target structural motif," or "target motif," refers to any 
rationally selected sequence or combination of sequences in which the sequence(s) are 
chosen based on a three-dimensional configuration which is formed upon the folding of 
the target motif. There are a variety of target motifs known in the art. Protein target
motifs include, but are not limited to, enzyme active sites and signal sequences. Nucleic
acid target motifs include, but are not limited to, promoter sequences, hairpin structures
and inducible expression elements (protein binding sequences). 

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be
used to control gene expression through triple helix formation or antisense DNA or RNA,
both of which methods are based on the binding of a polynucleotide sequence to DNA or
RNA. Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in
length and are designed to be complementary to a region of the gene involved in
transcription (triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al.,
Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA
itself (antisense - Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as
Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple
helix formation optimally results in a shut-off of RNA transcription from DNA, while
antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide.
Both techniques have been demonstrated to be effective in model systems. Information
contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide.
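
As a minimal sketch of how sequence information feeds antisense design, the following Python function derives a DNA oligonucleotide complementary to a chosen 20 to 40 base window of a target transcript. It covers only Watson-Crick complementarity; triple-helix design, target-site selection and specificity checks are outside its scope. The function name and the example target are hypothetical.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(target, start=0, length=20):
    """Return a DNA oligo (5'->3') complementary to target[start:start+length].

    The target may be given as mRNA (with U) or as DNA (with T).
    """
    if not 20 <= length <= 40:
        raise ValueError("length should be between 20 and 40 bases")
    region = target.upper().replace("U", "T")[start:start + length]
    if len(region) < length:
        raise ValueError("window extends past the end of the target")
    if set(region) - set("ACGT"):
        raise ValueError("target contains characters other than A, C, G, T/U")
    # The reverse complement pairs antiparallel with the target window.
    return region.translate(DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Placeholder transcript fragment for illustration only.
    print(antisense_oligo("AAAAAAAAAACCCCCCCCCC", start=0, length=20))
```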

4.16 DIAGNOSTIC ASSAYS AND KITS

The present invention further provides methods to identify the presence or
expression of one of the ORFs of the present invention, or homolog thereof, in a test 
sample, using a nucleic acid probe or antibodies of the present invention, optionally 
conjugated or otherwise associated with a suitable label. 

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the
polynucleotide for a period sufficient to form the complex, and detecting the complex, so
that if a complex is detected, a polynucleotide of the invention is detected in the sample.
Such methods can also comprise contacting a sample under stringent hybridization
conditions with nucleic acid primers that anneal to a polynucleotide of the invention
under such conditions, and amplifying annealed polynucleotides, so that if a
polynucleotide is amplified, a polynucleotide of the invention is detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise 
contacting a sample with a compound that binds to and forms a complex with the 
polypeptide for a period sufficient to form the complex, and detecting the complex, so
that if a complex is detected, a polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the 
antibodies or one or more of the nucleic acid probes of the present invention and assaying 
for binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample
vary. Incubation conditions depend on the format employed in the assay, the detection
methods employed, and the type and nature of the nucleic acid probe or antibody used in
the assay. One skilled in the art will recognize that any one of the commonly available
hybridization, amplification or immunological assay formats can readily be adapted to
employ the nucleic acid probes or antibodies of the present invention. Examples of such
assays can be found in Chard, T., An Introduction to Radioimmunoassay and Related
Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock,
G.R. et al., Techniques in Immunocytochemistry, Academic Press, Orlando, FL Vol. 1
(1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice and Theory of Immunoassays:
Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science
Publishers, Amsterdam, The Netherlands (1985). The test samples of the present
invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described
method will vary based on the assay format, nature of the detection method and the
tissues, cells or extracts used as the sample to be assayed. Methods for preparing protein
extracts or membrane extracts of cells are well known in the art and can readily be
adapted in order to obtain a sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain
the necessary reagents to carry out the assays of the present invention. Specifically, the
invention provides a compartment kit to receive, in close confinement, one or more
containers which comprises: (a) a first container comprising one of the probes or
antibodies of the present invention; and (b) one or more other containers comprising one
or more of the following: wash reagents, reagents capable of detecting presence of a
bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in
separate containers. Such containers include small glass containers, plastic containers or
strips of plastic or paper. Such containers allow one to efficiently transfer reagents from
one compartment to another compartment such that the samples and reagents are not
cross-contaminated, and the agents or solutions of each container can be added in a
quantitative fashion from one compartment to another. Such containers will include a
container which will accept the test sample, a container which contains the antibodies
used in the assay, containers which contain wash reagents (such as phosphate buffered
saline, Tris-buffers, etc.), and containers which contain the reagents used to detect the
bound antibody or probe. Types of detection reagents include labeled nucleic acid probes,
labeled secondary antibodies, or in the alternative, if the primary antibody is labeled, the
enzymatic, or antibody binding reagents which are capable of reacting with the labeled
antibody. One skilled in the art will readily recognize that the disclosed probes and
antibodies of the present invention can be readily incorporated into one of the established
kit formats which are well known in the art.

4.17 MEDICAL IMAGING

The novel polypeptides and binding partners of the invention are useful in 
medical imaging of sites expressing the molecules of the invention (e.g., where the 
polypeptide of the invention is involved in the immune response, for imaging sites of 
inflammation or infection). See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such
methods involve chemical attachment of a labeling or imaging agent, administration of
the labeled polypeptide to a subject in a pharmaceutically acceptable carrier, and imaging 
the labeled polypeptide in vivo at the target site. 

4.18 SCREENING ASSAYS 
Using the isolated proteins and polynucleotides of the invention, the present
invention further provides methods of obtaining and identifying agents which bind to a
polypeptide encoded by an ORF corresponding to any of the nucleotide sequences set
forth in SEQ ID NOs: 1 - 438, or bind to a specific domain of the polypeptide encoded by
the nucleic acid. In detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the
present invention, or nucleic acid of the invention; and

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a
polynucleotide of the invention for a time sufficient to form a polynucleotide/compound
complex, and detecting the complex, so that if a polynucleotide/compound complex is
detected, a compound that binds to a polynucleotide of the invention is identified.

Likewise, in general, such methods for identifying compounds that bind
to a polypeptide of the invention can comprise contacting a compound with a polypeptide
of the invention for a time sufficient to form a polypeptide/compound complex, and
detecting the complex, so that if a polypeptide/compound complex is detected, a
compound that binds to a polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention
can also comprise contacting a compound with a polypeptide of the invention in a cell for
a time sufficient to form a polypeptide/compound complex, wherein the complex drives
expression of a reporter gene sequence in the cell, and detecting the complex by
detecting reporter gene sequence expression, so that if a polypeptide/compound complex
is detected, a compound that binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate
the activity of a polypeptide of the invention (that is, increase or decrease its activity,
relative to activity observed in the absence of the compound). Alternatively, compounds
identified via such methods can include compounds which modulate the expression of a
polynucleotide of the invention (that is, increase or decrease expression relative to
expression levels observed in the absence of the compound). Compounds, such as
compounds identified via the methods of the invention, can be tested using standard
assays well known to those of skill in the art for their ability to modulate
activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be 
selected and screened at random or rationally selected or designed using protein modeling 
techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical
agents and the like are selected at random and are assayed for their ability to bind to the
protein encoded by the ORF of the present invention. Alternatively, agents may be
rationally selected or designed. As used herein, an agent is said to be "rationally selected
or designed" when the agent is chosen based on the configuration of the particular
protein. For example, one skilled in the art can readily adapt currently available
procedures to generate peptides, pharmaceutical agents and the like, capable of binding to
a specific peptide sequence, in order to generate rationally designed antipeptide peptides,
for example see Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," in
Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kaspczak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as
broadly described, can be used to control gene expression through binding to one of the
ORFs or EMFs of the present invention. As described above, such agents can be
randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a
skilled artisan to design sequence specific or element specific agents, modulating the
expression of either a single ORF or multiple ORFs which rely on the same EMF for
expression control. One class of DNA binding agents are agents which contain base
residues which hybridize or form a triple helix by binding to DNA or RNA.
Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or
can be a variety of sulfhydryl or polymeric derivatives which have base attachment
capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are 
designed to be complementary to a region of the gene involved in transcription (triple 
helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 
(1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA itself (antisense -
Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of
Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple helix formation optimally
results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization
blocks translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the sequences
of the present invention is necessary for the design of an antisense or triple helix
oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present 
invention can be used as a diagnostic agent. Agents which bind to a protein encoded by 
one of the ORFs of the present invention can be formulated using known techniques to 
generate a pharmaceutical composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific
nucleic acid hybridization probes capable of hybridizing with naturally occurring
nucleotide sequences. The hybridization probes of the subject invention may be derived
from any of the nucleotide sequences SEQ ID NOs: 1 - 438. Because the corresponding
gene is only expressed in a limited number of tissues, a hybridization probe derived from
any of the nucleotide sequences SEQ ID NOs: 1 - 438 can be used as an indicator of
the presence of RNA of cell type of such a tissue in a sample.

Any suitable hybridization technique can be employed, such as, for example, in
situ hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188
provides additional uses for oligonucleotides based upon the nucleotide sequences. Such
probes used in PCR may be of recombinant origin, may be chemically synthesized, or a
mixture of both. The probe will comprise a discrete nucleotide sequence for the detection
of identical sequences or a degenerate pool of possible sequences for identification of
closely related genomic sequences.
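
To illustrate what a "degenerate pool of possible sequences" amounts to, the sketch below expands a probe written with the standard IUPAC nucleotide ambiguity codes into every discrete sequence it represents. The function names, the size cap and the example probes are assumptions made for illustration; the source does not prescribe any particular notation for degenerate probes.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(probe):
    """Number of discrete sequences represented by a degenerate probe."""
    count = 1
    for base in probe.upper():
        if base not in IUPAC:
            raise ValueError(f"unknown nucleotide code: {base!r}")
        count *= len(IUPAC[base])
    return count


def expand_degenerate_probe(probe, max_sequences=4096):
    """List every discrete sequence in the degenerate pool."""
    probe = probe.upper()
    if degeneracy(probe) > max_sequences:
        raise ValueError("pool is larger than max_sequences")
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in probe))]


if __name__ == "__main__":
    # Hypothetical probe fragment: R stands for A or G, Y for C or T.
    print(expand_degenerate_probe("ATGGARY"))
```

A probe with no ambiguity codes expands to a single discrete sequence, which corresponds to the first alternative named in the text.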

Other means for producing specific hybridization probes for nucleic acids include
the cloning of nucleic acid sequences into vectors for the production of mRNA probes.
Such vectors are known in the art and are commercially available and may be used to
synthesize RNA probes in vitro by means of the addition of the appropriate RNA
polymerase such as T7 or SP6 RNA polymerase and the appropriate radioactively labeled
nucleotides. The nucleotide sequences may be used to construct hybridization probes for
mapping their respective genomic sequences. The nucleotide sequence provided herein 
may be mapped to a chromosome or specific regions of a chromosome using well known 
genetic and/or chromosomal mapping techniques. These techniques include in situ 
hybridization, linkage analysis against known chromosomal markers, hybridization 
screening with libraries or flow-sorted chromosomal preparations specific to known 
chromosomes, and the like. The technique of fluorescent in situ hybridization of 
chromosome spreads has been described, among other places, in Verma et al. (1988)
Human Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY. 

Fluorescent in situ hybridization of chromosomal preparations and other physical 
chromosome mapping techniques may be correlated with additional genetic map data. 
Examples of genetic map data can be found in the 1994 Genome Issue of Science 
(265:1981f). Correlation between the location of a nucleic acid on a physical 
chromosomal map and a specific disease (or predisposition to a specific disease) may 
help delimit the region of DNA associated with that genetic disease. The nucleotide
sequences of the subject invention may be used to detect differences in gene sequences 
between normal, carrier or affected individuals. 

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES 

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly 
practiced using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to 
those of skill in the art using any suitable support such as glass, polystyrene or Teflon. One 
strategy is to precisely spot oligonucleotides synthesized by standard synthesizers. 
Immobilization can be achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. 
Microbiol. 28(6) 1469-72); using UV light (Nagata et al., 1985; Dahlen et al., 1987;
Morrissey & Collins, (1989) Mol. Cell Probes 3(2) 189-207) or by covalent binding of base
modified DNA (Keller et al., 1988; 1989); all references being specifically incorporated
herein. 

Another strategy that may be employed is the use of the strong biotin-streptavidin 
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8)
3072-6, describe the use of biotinylated probes, although these are duplex probes, that are
immobilized on streptavidin-coated magnetic beads. Streptavidin-coated beads may be
purchased from Dynal, Oslo. Of course, this same linking chemistry is applicable to coating
any surface with streptavidin. Biotinylated probes may be purchased from various sources,
such as, e.g., Operon Technologies (Alameda, CA).

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be
used. Nunc Laboratories have developed a method by which DNA can be covalently bound
to the microwell surface termed Covalink NH. Covalink NH is a polystyrene surface
grafted with secondary amino groups (>NH) that serve as bridge-heads for further covalent
coupling. Covalink Modules may be purchased from Nunc Laboratories. DNA molecules
may be bound to Covalink exclusively at the 5'-end by a phosphoramidate bond, allowing
immobilization of more than 1 pmol of DNA (Rasmussen et al., (1991) Anal. Biochem.
198(1) 138-42).

The use of Covalink NH strips for covalent binding of DNA molecules at the 5'-end 
has been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond 
is employed (Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as 
immobilization using only a single covalent bond is preferred. The phosphoramidate bond 
joins the DNA to the Covalink NH secondary amino groups that are positioned at the end 
of 2 nm long spacer arms covalently grafted onto the polystyrene surface. To link an 
oligonucleotide to Covalink NH via a phosphoramidate bond, the 
oligonucleotide terminus must have a 5'-end phosphate group. It is, perhaps, even possible 
for biotin to be covalently bound to Covalink and then streptavidin used to bind the probes. 

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/µl), 
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 
1-methylimidazole, pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 
1-MeIm7. The ssDNA solution is then dispensed into Covalink NH strips (75 µl/well) 
standing on ice. 

Carbodiimide, 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), 
dissolved in 10 mM 1-MeIm7, is made fresh and 25 µl added per well. The strips are 
incubated for 5 hours at 50°C. After incubation the strips are washed using, e.g., 
Nunc-Immuno Wash; first the wells are washed 3 times, then they are soaked with washing 
solution for 5 min., and finally they are washed 3 times (wherein the washing solution is 0.4 
N NaOH, 0.25% SDS heated to 50°C). 
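
The volumes and concentrations in the linkage procedure above can be checked with a short dilution calculation. The following is only an illustrative sketch: the 900 µl batch volume is a hypothetical figure, and the function names are not taken from any cited protocol. It assumes simple additive volumes (C1 x V1 = C2 x (V1 + V2)).

```python
def stock_volume_for_final(sample_volume, stock_conc, final_conc):
    """Volume of stock to add to sample_volume so the mixture reaches final_conc.

    Derived from stock_conc * v = final_conc * (sample_volume + v).
    """
    if not 0 < final_conc < stock_conc:
        raise ValueError("final_conc must be positive and below stock_conc")
    return sample_volume * final_conc / (stock_conc - final_conc)


def diluted_conc(conc, volume, added_volume):
    """Concentration after adding added_volume of diluent to volume at conc."""
    return conc * volume / (volume + added_volume)


# Hypothetical batch: 900 ul of denatured DNA at 7.5 ng/ul.
dna_volume_ul = 900.0
# 0.1 M (100 mM) 1-methylimidazole stock, 10 mM final concentration.
meim_ul = stock_volume_for_final(dna_volume_ul, 100.0, 10.0)
# DNA concentration after the 1-methylimidazole has been added.
dna_ng_per_ul = diluted_conc(7.5, dna_volume_ul, meim_ul)
# DNA mass dispensed into each well (75 ul/well).
dna_ng_per_well = dna_ng_per_ul * 75.0
# EDC concentration in the well after adding 25 ul of 0.2 M (200 mM) EDC to 75 ul.
edc_final_mM = diluted_conc(200.0, 25.0, 75.0)
```

Under these assumptions, one volume of 0.1 M stock is added per nine volumes of DNA solution, and each well ends up at 100 µl total volume.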

It is contemplated that a further suitable method for use with the present invention is 
that described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated 
herein by reference. This method of preparing an oligonucleotide bound to a support 
involves attaching a nucleoside 3'-reagent through the phosphate group by a covalent 
phosphodiester link to aliphatic hydroxyl groups carried by the support. The 
oligonucleotide is then synthesized on the supported nucleoside and protecting groups 
removed from the synthetic oligonucleotide chain under standard conditions that do not 
cleave the oligonucleotide from the support. Suitable reagents include nucleoside 
phosphoramidite and nucleoside hydrogen phosphorate. 

An on-chip strategy for the preparation of DNA probe arrays may be employed. 
For example, addressable laser-activated photodeprotection 
may be employed in the chemical synthesis of oligonucleotides directly on a glass surface, 
as described by Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by 
reference. Probes may also be immobilized on nylon supports as described by Van Ness et 
al. (1991) Nucleic Acids Res. 19(12) 3345-50; or linked to Teflon using the method of 
Duncan & Cavalier (1988) Anal. Biochem. 169(1) 104-8; all references being specifically 
incorporated herein. 

To link an oligonucleotide to a nylon support, as described by Van Ness et al. 
(1991), requires activation of the nylon surface via alkylation and selective activation of the 
5'-amine of oligonucleotides with cyanuric chloride. 

One particular way to prepare support bound oligonucleotides is to utilize the 
light-generated synthesis described by Pease et al., (1994) PNAS USA 91(11) 5022-6, 
incorporated herein by reference. These authors used current photolithographic techniques 
to generate arrays of immobilized oligonucleotide probes (DNA chips). These methods, in 
which light is used to direct the synthesis of oligonucleotide probes in high-density, 
miniaturized arrays, utilize photolabile 5'-protected N-acyl-deoxynucleoside 
phosphoramidites, surface linker chemistry and versatile combinatorial synthesis strategies. 
A matrix of 256 spatially defined oligonucleotide probes may be generated in this manner. 
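
To illustrate how a combinatorial strategy can yield a spatially defined matrix, the sketch below enumerates every sequence over four variable positions (4^4 = 256) and assigns each a grid address. This is an assumption made for illustration only: the text does not state how the 256 probes were composed, and the 16 x 16 grid and function names are hypothetical.

```python
from itertools import product

BASES = "ACGT"


def all_probes(variable_positions):
    """Every sequence of the given length over A, C, G, T, in lexicographic order."""
    return ["".join(p) for p in product(BASES, repeat=variable_positions)]


def grid_layout(probes, columns):
    """Map each probe to a (row, column) address, filling the grid row by row."""
    return {probe: divmod(i, columns) for i, probe in enumerate(probes)}


probes = all_probes(4)            # 4 variable positions -> 256 probes
layout = grid_layout(probes, 16)  # hypothetical 16 x 16 matrix
```

Each probe then has a unique address, so a hybridization signal at a given position identifies the probe sequence at that site.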






4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, 
genomic DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC 
inserts, and RNA, including mRNA without any amplification steps. For example, 
Sambrook et al. (1989) describe three protocols for the isolation of high molecular weight 
DNA from mammalian cells (p. 9.14-9.23). 

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors 
and/or prepared directly from genomic DNA or cDNA by PCR or other amplification 
methods. Samples may be prepared or dispensed in multiwell plates. About 100-1000 ng of 
DNA samples may be prepared in 2-500 µl of final volume. 

The nucleic acids would then be fragmented by any of the methods known to those 
of skill in the art including, for example, using restriction enzymes as described at 9.24-9.28 
of Sambrook et al. (1989), shearing by ultrasound and NaOH treatment. 

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) 
Nucleic Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA 
samples are passed through a small French pressure cell at a variety of low to intermediate 
pressures. A lever device allows controlled application of low to intermediate pressures to 
the cell. The results of these studies indicate that low-pressure shearing is a useful 
alternative to sonic and enzymatic DNA fragmentation methods. 

One particularly suitable way for fragmenting DNA is contemplated to be that using 
the two base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic 
Acids Res. 20(14) 3753-62. These authors described an approach for the rapid 
fragmentation and fractionation of DNA into particular sizes that they contemplated to be 
suitable for shotgun cloning and sequencing. 

The restriction endonuclease CviJI normally cleaves the recognition sequence 
PuGCPy between the G and C to leave blunt ends. Atypical reaction conditions, which alter 
the specificity of this enzyme (CviJI**), yield a quasi-random distribution of DNA 
fragments from the small molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) 
quantitatively evaluated the randomness of this fragmentation strategy, using a CviJI** 
digest of pUC19 that was size fractionated by a rapid gel filtration method and directly 
ligated, without end repair, to a lacZ minus M13 cloning vector. Sequence analysis of 76 
clones showed that CviJI** restricts PyGCPy and PuGCPu, in addition to PuGCPy sites, and 
that new sequence data is accumulated at a rate consistent with random fragmentation. 
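
The recognition rules described above can be sketched as an in-silico digest. This is a simplified, single-strand illustration that assumes every matching site is cut to completion. The function names are hypothetical, and the "relaxed" mode simply encodes the three site classes reported for CviJI** (PuGCPy, PyGCPy and PuGCPu).

```python
import re

# Pu (purine) = A or G; Py (pyrimidine) = C or T.
# Lookaheads are used so that overlapping sites are all found.
NORMAL = re.compile(r"(?=[AG]GC[CT])")                            # PuGCPy
RELAXED = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")     # + PyGCPy, PuGCPu


def cut_positions(seq, relaxed=False):
    """Indices at which the sequence is cut (between the G and the C of each site)."""
    pattern = RELAXED if relaxed else NORMAL
    # A match starts at the base flanking the G, so the G|C junction is 2 bases in.
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def digest(seq, relaxed=False):
    """Blunt-ended fragments produced by cutting at every recognized site."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]
```

For example, `digest("TTAGCTAA")` yields `["TTAG", "CTAA"]`, whereas `"TTCGCTAA"` (a PyGCPy site) is cut only when `relaxed=True`, and a PyGCPu site such as the one in `"TTCGCAAA"` is not cut in either mode.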

As reported in the literature, advantages of this approach compared to sonication and 
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead 
of 2-5 µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or 
agarose gel electrophoresis and elution are needed). 

Irrespective of the manner in which the nucleic acid fragments are obtained or 
prepared, it is important to denature the DNA to give single stranded pieces available for 
hybridization. This is achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. 
The solution is then cooled quickly to 2°C to prevent renaturation of the DNA fragments 
before they are contacted with the chip. Phosphate groups must also be removed from 
genomic DNA by methods known in the art. 

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon 
membrane. Spotting may be performed by using arrays of metal pins (the positions of which 
correspond to an array of wells in a microtiter plate) to repeatedly transfer about 20 nl of 
a DNA solution to a nylon membrane. By offset printing, a density of dots higher than the 
density of the wells is achieved. One to 25 dots may be accommodated in 1 mm², 
depending on the type of label used. By avoiding spotting in some preselected number of 
rows and columns, separate subsets (subarrays) may be formed. Samples in one subarray 
may be the same genomic segment of DNA (or the same gene) from different individuals, or 
may be different, overlapped genomic clones. Each of the subarrays may represent replica 
spotting of the same samples. In one example, a selected gene segment may be amplified 
from 64 patients. For each patient, the amplified gene segment may be in one 96-well plate 
(all 96 wells containing the same sample). A plate for each of the 64 patients is prepared. By 
using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane. Subarrays 
may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the 
dot span may be 1 mm² and there may be a 1 mm space between subarrays. 
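
The 64-patient example can be sketched as a coordinate calculation for offset printing. Several values here are assumptions, not statements from the text: a 9 mm pin pitch (the usual 96-well spacing), a square 8 x 8 arrangement of the 64 samples in each subarray, and a 1 mm centre-to-centre dot spacing. With those values each subarray spans 8 mm, leaving the 1 mm gap between subarrays mentioned above.

```python
PIN_ROWS, PIN_COLS = 8, 12   # 96-pin device
PIN_PITCH_MM = 9.0           # assumed standard 96-well pitch
DOT_PITCH_MM = 1.0           # assumed centre-to-centre dot spacing
SUBARRAY_SIDE = 8            # assumed 8 x 8 = 64 samples per subarray


def dot_position(pin_row, pin_col, sample_index):
    """(x, y) in mm of the dot a given pin deposits for a given sample plate.

    Each successive plate is printed at a small offset from the previous one,
    so every pin builds up its own subarray of 64 dots.
    """
    off_row, off_col = divmod(sample_index, SUBARRAY_SIDE)
    x = pin_col * PIN_PITCH_MM + off_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + off_row * DOT_PITCH_MM
    return (x, y)


def full_layout(n_samples=64):
    """Positions of every dot, keyed by (pin_row, pin_col, sample_index)."""
    return {
        (r, c, s): dot_position(r, c, s)
        for r in range(PIN_ROWS)
        for c in range(PIN_COLS)
        for s in range(n_samples)
    }
```

Under these assumptions the 96 x 64 = 6144 dots occupy roughly 107 mm by 71 mm, which fits on the 8 x 12 cm membrane.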

Another approach is to use membranes or plates (available from NUNC, Naperville, 
Illinois) which may be partitioned by physical spacers, e.g., a plastic grid molded over the 
membrane, the grid being similar to the sort of membrane applied to the bottom of multiwell 
plates, or hydrophobic strips. A fixed physical spacer is not preferred for imaging by 
exposure to flat phosphor-storage screens or x-ray films. 

The present invention is illustrated in the following examples. Upon consideration 
of the present disclosure, one of skill in the art will appreciate that many other embodiments 
and variations may be made in the scope of the present invention. Accordingly, it is 
intended that the broader aspects of the present invention not be limited to the disclosure of 
the following examples. The present invention is not to be limited in scope by the 
exemplified embodiments which are intended as illustrations of single aspects of the 
invention, and compositions and methods which are functionally equivalent are within the 
scope of the invention. Indeed, numerous modifications and variations in the practice of the 
invention are expected to occur to those skilled in the art upon consideration of the present 
preferred embodiments. Consequently, the only limitations which should be placed upon 
the scope of the invention are those which appear in the appended claims. 

All references cited within the body of the instant specification are hereby 
incorporated by reference in their entirety. 

5.0 EXAMPLES 

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 
A plurality of novel nucleic acids were obtained from cDNA libraries prepared from 
various human tissues and in some cases isolated from a genomic library derived from 
human chromosome using standard PCR, SBH sequence signature analysis and Sanger 
sequencing techniques. The inserts of the library were amplified with PCR using primers 
specific for the vector sequences which flank the inserts. Clones from cDNA libraries were 
spotted on nylon membrane filters and screened with oligonucleotide probes (e.g., 7-mers) 
to obtain signature sequences. The clones were clustered into groups of similar or identical 
sequences. Representative clones were selected for sequencing. 
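
The idea of clustering clones by probe signature can be sketched as follows. This is a toy model, not the SBH analysis actually used: a probe is treated as "hybridizing" when it occurs as an exact substring of the clone, similarity is measured with a Jaccard index, and the 0.8 threshold and all function names are hypothetical choices.

```python
def signature(clone_seq, probes):
    """Set of probes found in the clone (stand-in for positive hybridization signals)."""
    seq = clone_seq.upper()
    return frozenset(p for p in probes if p in seq)


def jaccard(a, b):
    """Similarity of two signatures: shared probes divided by all probes seen."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(clones, probes, threshold=0.8):
    """Greedy clustering: a clone joins the first cluster whose representative
    signature is at least `threshold` similar, otherwise it founds a new cluster."""
    clusters = []  # list of (representative_signature, [clone names])
    for name, seq in clones.items():
        sig = signature(seq, probes)
        for rep_sig, members in clusters:
            if jaccard(sig, rep_sig) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((sig, [name]))
    return [members for _, members in clusters]
```

The first member of each resulting group could then serve as the representative clone chosen for sequencing.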

In some cases, the 5' sequence of the amplified inserts was then deduced using a 
typical Sanger sequencing protocol. PCR products were purified and subjected to 
fluorescent dye terminator cycle sequencing. Single pass gel sequencing was done using a 
377 Applied Biosystems (ABI) sequencer to obtain the novel nucleic acid sequences. In 
some cases RACE (Rapid Amplification of cDNA Ends) was performed to further extend 
the sequence in the 5' direction. 

5.2 EXAMPLE 2 
Novel Nucleic Acids 

The novel nucleic acids of the present invention were assembled 
from sequences that were obtained from a cDNA library by methods described in Example 
1 above, and in some cases sequences obtained from one or more public databases. The 
nucleic acids were assembled using an EST sequence as a seed. Then a recursive algorithm 
was used to extend the seed EST into an extended assemblage, by pulling additional 
sequences from different databases (i.e., Hyseq's database containing EST sequences, 
dbEST version 119, gb pri 119, and UniGene version 119) that belong to this assemblage. 
The algorithm terminated when there were no additional sequences from the above databases 
that would extend the assemblage. Inclusion of component sequences into the assemblage 
was based on a BLASTN hit to the extending assemblage with BLAST score greater than 
300 and percent identity greater than 95%. 
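
The seed-extension loop described above can be sketched as follows. This is a minimal illustration, not the algorithm as actually implemented: the `search` callable is a hypothetical stand-in for a BLASTN query against the databases, the `Hit` record is an invented structure, and the recursion is written as an equivalent worklist loop.

```python
from typing import Callable, Iterable, NamedTuple, Set


class Hit(NamedTuple):
    seq_id: str
    score: float
    percent_identity: float


MIN_SCORE = 300.0      # BLAST score must be greater than this
MIN_IDENTITY = 95.0    # percent identity must be greater than this


def extend_assemblage(seed_id: str,
                      search: Callable[[str], Iterable[Hit]]) -> Set[str]:
    """Grow an assemblage from a seed until no new qualifying sequence is found."""
    assemblage = {seed_id}
    frontier = [seed_id]            # members whose hits have not yet been examined
    while frontier:
        query = frontier.pop()
        for hit in search(query):
            if hit.seq_id in assemblage:
                continue
            if hit.score > MIN_SCORE and hit.percent_identity > MIN_IDENTITY:
                assemblage.add(hit.seq_id)
                frontier.append(hit.seq_id)
    return assemblage


# Toy hit table standing in for database searches (hypothetical identifiers).
toy_hits = {
    "seed": [Hit("est1", 450, 98.0), Hit("est2", 280, 99.0)],
    "est1": [Hit("est3", 320, 96.5), Hit("est4", 500, 94.0)],
    "est3": [Hit("seed", 600, 99.0)],
}
members = extend_assemblage("seed", lambda q: toy_hits.get(q, []))
```

In the toy data, `est2` fails the score cutoff and `est4` fails the identity cutoff, so the loop stops once `est3` has contributed no new members.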

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any 
frame shifts and incorrect stop codons were corrected by hand editing. During editing, the 
sequence was checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 
120, gb pri 120, UniGene version 120, Genpept release 120). Other computer programs 
which may have been used in the editing process were phredPhrap and Consed (University 
of Washington) and ed-ready, ed-ext and cg-zip-2 (Hyseq, Inc.). The full-length nucleotide 
and amino acid sequences, including splice variants resulting from these procedures, are 
shown in the Sequence Listing as SEQ ID NOS: 1-438. 

Table 1 shows the various tissue sources of SEQ ID NO: 1-438. 
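
Table 1 lists SEQ ID NOs per library as space-separated numbers and ranges (for example "76-77 91 106-107"). The sketch below shows one way such cells could be expanded and queried. The function names are hypothetical, and the small `table` dictionary contains only three rows copied from Table 1 for illustration.

```python
def parse_seq_ids(cell):
    """Expand a Table 1 cell such as '76-77 91' into a sorted list of integers."""
    ids = set()
    for token in cell.split():
        if "-" in token:
            start, end = token.split("-")
            ids.update(range(int(start), int(end) + 1))
        else:
            ids.add(int(token))
    return sorted(ids)


def libraries_for(seq_id, table):
    """Library names whose SEQ ID NO: cell includes the given identifier."""
    return [lib for lib, cell in table.items() if seq_id in parse_seq_ids(cell)]


# Three rows from Table 1 (library name -> SEQ ID NO: cell).
table = {
    "AB3001": "76-77 91 106-107 115 134 163-164 178 203 232 255 265 276 279 322-323",
    "ABR011": "85 90",
    "ABR013": "85 322",
}
```

With these rows, `libraries_for(85, table)` returns the two libraries in which SEQ ID NO: 85 was found.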

The nearest neighbor results for polypeptides encoded by SEQ ID NO: 1-438 
were obtained by a BLASTP (version 2.0a19MP-WashU) search against Genpept, 
Geneseq and SwissProt databases using the BLAST algorithm. The nearest neighbor result 
showed the closest homologue with functional annotation for SEQ ID NO: 1-438. The 
translated amino acid sequences encoded by the nucleic acid sequences are shown 
in the Sequence Listing. The homologues with identifiable functions for SEQ ID NO: 1-438 
are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, 
Stanford, CA) (Wu et al., J. Comp. Biol., Vol. 6 pp. 219-235 (1999) herein incorporated 
by reference), all the sequences were examined to determine whether they had 
identifiable signature regions. Table 3 shows the signature region found in the indicated 
polypeptide sequences, the description of the signature, the eMatrix p-value(s) and the 
position(s) of the signature within the polypeptide sequence. 

Using the Pfam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 
26(1) pp. 320-322 (1998) herein incorporated by reference), polypeptides encoded by 
SEQ ID NO: 1-438 (i.e. SEQ ID NO: 1-438) were examined for domains with homology 
to certain peptide domains. Table 4 shows the name of the domain found, the 
description, the product of all the e-values of similar domains found, the Pfam score for 
the identified domain within the sequence, the number of similar domains found, and the 
position of the domain in the SEQ ID NO: being interrogated. 

The GeneAtlas™ software package (Molecular Simulations Inc. (MSI), San 
Diego, CA) was used to predict the three-dimensional structure models for the 
polypeptides encoded by SEQ ID NO: 1-438 (i.e. SEQ ID NO: 1-438). Models were 
generated by (1) PSI-BLAST, which is a multiple alignment sequence profile-based 
searching method developed by Altschul et al. (Nucl. Acids Res. 25, 3389-3408 (1997)), (2) 
High Throughput Modeling (HTM) (Molecular Simulations Inc. (MSI), San Diego, CA), 
which is an automated sequence and structure searching procedure 
(http://www.msi.com/), and (3) SeqFold™, which is a fold recognition method described 
by Fischer and Eisenberg (J. Mol. Biol. 209, 779-791 (1998)). This analysis was carried 
out, in part, by comparing the polypeptides of the invention with the known NMR 
(nuclear magnetic resonance) and x-ray crystal three-dimensional structures as templates. 

Table 5 shows: "PDB ID", the Protein DataBase (PDB) identifier given to the template 
structure; "Chain ID", the identifier of the subcomponent of the PDB template structure; 
"Compound Information", information on the PDB template structure and/or its 
subcomponents; "PDB Function Annotation", the function of the PDB template as 
annotated by the PDB files (http://www.rcsb.org/PDB/); the start and end amino acid positions 
of the protein sequence aligned; the PSI-BLAST score; the verify score; the SeqFold score; 
and the Potential(s) of Mean Force (PMF). The verify score, produced by GeneAtlas™ 
software (MSI), is based on the Profile-3D threading program developed in 
Dr. David Eisenberg's laboratory (US patent no. 5,436,850 and Luthy, Bowie, and 
Eisenberg, Nature, 356:83-85 (1992)) and a publication by R. Sanchez and A. Sali, Proc. 
Natl. Acad. Sci. USA, 95:13597-13602. The verify score produced by GeneAtlas 
normalizes the verify score for proteins with different lengths so that a unified cutoff can 
be used to select good models as follows: 

Verify score (normalized) = (raw score - 1/2 high score)/(1/2 high score) 

The PMF score, produced by GeneAtlas™ software (MSI), is a composite scoring 
function that depends in part on the compactness of the model, sequence identity in the 
alignment used to build the model, and pairwise and surface mean force potentials (MFP). As 
given in Table 5, a verify score between 0 and 1.0, with 1 being the best, represents a good 
model. Similarly, a PMF score between 0 and 1.0, with 1 being the best, represents a good 
model. A SeqFold™ score of more than 50 is considered significant. A good model may 
also be determined by one of skill in the art based on all the information in Table 5 taken in 
totality. 
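
The normalization formula and the score ranges given above can be written out as a short sketch. The function names are hypothetical, and reporting the three criteria separately (rather than combining them into a single verdict) is a choice made here because the text leaves the overall judgment to one of skill in the art.

```python
def normalized_verify_score(raw_score, high_score):
    """(raw score - 1/2 high score) / (1/2 high score), as given in the text."""
    half = 0.5 * high_score
    if half == 0:
        raise ValueError("high_score must be non-zero")
    return (raw_score - half) / half


def model_checks(verify, pmf, seqfold):
    """Evaluate each stated criterion separately; no combined verdict is implied."""
    return {
        "verify_in_range": 0.0 <= verify <= 1.0,
        "pmf_in_range": 0.0 <= pmf <= 1.0,
        "seqfold_significant": seqfold > 50,
    }
```

A raw score equal to the high score normalizes to 1.0, and a raw score of half the high score normalizes to 0, which is why a single cutoff can be applied to proteins of different lengths.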

The nucleotide sequence within the sequences that codes for signal peptide 
sequences and their cleavage sites can be determined using the Neural Network SignalP 
V1.1 program (from the Center for Biological Sequence Analysis, The Technical University 
of Denmark). The process for identifying prokaryotic and eukaryotic signal peptides and 
their cleavage sites is also disclosed by Henrik Nielsen, Jacob Engelbrecht, Søren 
Brunak, and Gunnar von Heijne in the publication "Identification of prokaryotic and 
eukaryotic signal peptides and prediction of their cleavage sites", Protein Engineering, 
Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum S score and 
a mean S score, as described in the Nielsen et al. reference, were obtained for the 
polypeptide sequences. Table 6 shows the position of the signal peptide in each of the 
polypeptides and the maximum score and mean score associated with that signal peptide. 
Table 7 correlates each of SEQ ID NO: 1-438 to a specific chromosomal location. 

Table 8 is a correlation table of the novel polynucleotide sequences SEQ ID NO: 
1-438, novel polypeptide sequences SEQ ID NO: 1-438, and their corresponding priority 
nucleotide sequences in the priority application USSN 09/774,528, herein incorporated 
by reference in its entirety. 
Table 1 

Tissue Origin | RNA/Tissue Source | Library Name | SEQ ID NO: 
adult brain | GIBCO | AB3001 | 76-77 91 106-107 115 134 163-164 178 203 232 255 265 276 279 322-323 



adult brain | GIBCO | ABD003 | 16 19 24 77 80-81 85 89-90 92 96 98 105 110 116 121-123 125 130-132 134-136 138 142-143 151 153 158-159 163-164 184 191 193 196 198 200 208-209 213-214 216 219-220 223 229 232-234 236 239 241 243 257-259 262 265 267 274-276 278 284 292 302 317 321 324-325 327 337-338 340 348 359 371 391-392 400 



adult brain | Clontech | ABR001 | 1 18-19 35 80 98 125 136 153 185 200 209 221 228-229 239 243 274-275 302 399-400 
adult brain | Clontech | ABR0065 | 7-8 18 32 35 52 57 85 91 96 111 113 126 131 135 138-139 142 148 153-154 181 188 192 199 209-211 217 221 224 226 229 233 235 238 243 248 273 283-284 286 292 316 322 348 357 361 367 376 378 399 407 409 417 428 



adult brain | Clontech | ABR008 | 2 4 6-11 19-21 23-25 31 35-37 39-41 45-46 72-73 76 80-81 85 88-90 94-95 97 102-105 109 111-112 114-119 121-122 126-131 134-135 138-139 144 146-150 152-153 156-157 159 168-172 174-175 178 180 182 185-186 189-190 194 196 198-201 203 205-210 217 219 221-222 224 229-230 232-233 236-239 243-244 248 253-256 260-261 263-265 273 276 281-282 286-289 291-292 299-300 302 304 315-317 319 321-322 324 326 329 331-332 341 352-357 360 362 365 367-368 370 376-377 379-380 383-384 387-389 391-392 394 396-402 407-410 412-413 419 425-426 433 



adult brain | Clontech | ABR011 | 85 90 
adult brain | BioChain | ABR012 | 148 213 
adult brain | BioChain | ABR013 | 85 322 
adult brain | Invitrogen | ABR014 | 9 23 85 146 200 233 282 321 330 
adult brain | Invitrogen | ABR015 | 14 31 69 121 124 163 209 216 224 291 377 
adult brain | Invitrogen | ABR016 | 92 136 219 279 



adult brain | Invitrogen | ABT004 | 2 7-8 20-21 33 85 90-91 95 97 102-103 108 121 123 129-131 138-139 143 146 151 153 157-158 172 178 180 209-210 213 219 229-230 232 234 239 308 321 330 360 365 370-373 375 401 412 



adipocytes | Stratagene | ADP001 | 3-4 23 36 79 81 106-107 116 129 133-134 147 151 154 158 179 181 192 196 222 230 256-257 287 292 297 313 329 359 



adrenal gland | Clontech | ADR002 | 2 25 27 33 57 76 85-86 88 96 98 105-108 114 121-122 125 129-130 134 147 164 178 180 182 198-199 201 205 207-208 240-241 244 246 253-254 257 261 276 280 292 320 329 336 352 403 



adult heart | GIBCO | AHR001 | 3 17-21 27 32 74 76 85 89-91 95-96 102-103 105-110 117 121 124-125 128 131 134-136 139 141 148 151-153 155-156 161 163 181-182 186 190 193 198 200-201 205 207 211-213 215 222 225 229-230 234 251-254 257-259 263 274-277 280 292-297 301 303-304 315-316 319 329-331 345 359 384 417 423-424 






adult kidney | Invitrogen | AKT002 | 3 6 14 20-21 25-26 76 79 85 89 94 101 111 114 118 121 124 126 130-131 138 146 163 170 177-178 189 196 198 201 204 213 231 253-254 256-259 271 273-275 277 298 315 320 329 342 
















adult lung | GIBCO | ALG001 | 4 29 74 79 85 90 96 105 111 119 132 134 136 142 144 149 159 181 189 198 200 205-207 226 255 257 263 283 294 300 302-303 328 358-359 365 426 












lymph node | Clontech | ALN001 | 6 16 31 105 120 215 257 295 306 309 359 


young liver | GIBCO | ALV001 | 10-11 25-26 29 31 33 76 85 95 115 121-122 1 OA 126 130 143 146 156 158 164 178 182 187 189 229 248 253-254 261 278 283 304 342 375 


















adult liver | Invitrogen | ALV002 | 10-12 23 26 31 33-34 38 53 56 90-92 94-95 118 121 124 128-129 138 141 146 148 153 156 161 171 178 198 216 232 248 253-254 256-257 264 302 306 365 375 383 396 
adult liver | Clontech | ALV003 | 10-11 156 171 188 












Ovary | Invitrogen | AOV001 | 3-8 10-11 14 16 19-22 24 27-31 34 36 57 73 75-76 81-82 85 89-91 94-98 104-109 111 115-116 121-128 130-131 134 136 138-139 141 143-144 146 149-150 152 155 157-160 163-166 170-173 175 177-178 180 182 184-187 189-190 193-194 196-197 200-201 212-213 215 217 222 225-226 228 230-233 235 241-243 245 248 253-259 261 266-267 270 272-273 276-278 283-285 287 289 292 297-299 305-306 315-317 319 323-325 329-331 341 343-344 352 358-359 363-366 382-383 386 389-390 412 














Placenta | Invitrogen | APL001 | 73 92 117 135 182 194 232 246 261 272 282 359 
placenta | Invitrogen | APL002 | 16 28 92 121 135 144 157 178 210 394 


adult spleen | GIBCO | ASP001 | 3-4 16 32-33 35 90 96 99-100 123-125 128 131 134 136 139 151 178 181 189 194 200 210 218 229 251 253-255 257 276 283 307-309 315 329 354-355 357 392 400 


testis | GIBCO | ATS001 | 22 73 82 91 96-97 104-105 117 124 130 134 164 173 200 209 222 233 241 253-254 257 285 287-288 305 325 329 351-353 359 




bladder | Invitrogen | BLD001 | 4 108 130 150 212 226 236 240 242 257 276 287 305 395-396 415 












bone marrow | Clontech | BMD001 | 1 4-5 22 29-30 34 72 85 88 90 92 94 98 104-107 109 111 113 117 120 123-125 128-129 132 135 140 142 144 146 152 163 165-166 170-173 177 180 182 186 189-190 198-209 215 222 225 232 240-246 251-252 260-261 273-275 277-280 283-285 300 316 318 346-347 359 
















bone marrow | GF | BMD002 | 1 4 7-8 10-11 16 19 25 31 49 61-62 72 74 76 80 85 88 90 93-95 97-101 109-110 112 114 116-117 121 126 129 132 135 141 144 146 149-150 154 157 160 162-163 165-166 170-172 175 178-180 182-183 186-190 192-194 198-200 203 208 210-213 215 223 225 234 242 245 247 251-254 256-257 265 270 273 276-278 280 285 287 289 291 293-294 299 302 307 309 315 322 324 337-338 353 356-357 359 367 369 388 407 414 419 426 434 
bone marrow | Clontech | BMD007 | 144 


*Mixture of 16 tissues - mRNA | Various Vendors | CGd010 | 1 34-35 95 152 161 171 182 206 219 242 260 267 276 280 288 297 300 315-316 412 
*Mixture of 16 tissues - mRNA | Various Vendors | CGd011 | 45 51 167 188 216 251-252 
*Mixture of 16 tissues - mRNA | Various Vendors | CGd012 | 2 10-11 18-21 29 31 34-35 40 42-43 45 48 50-52 69-71 87-89 94-95 98-105 109 111-113 117 120 123 125 127 131 135-136 138 146 158 163 165-169 175 180 187-188 191 198 201 208 216 219-221 224 226 234 236 238-239 241-246 251-252 260 264 270 276-277 279 281 283-284 287 295-296 314 319 321 327-328 331 333-334 337-341 343 351-352 361 365 369 379-380 387 389 395 397-399 402 406 410-412 417 419 424 426 431-433 
*Mixture of 16 tissues - mRNA | Various Vendors | CGd013 | 29 48 101 146 167-169 187 219 234 327 333 339 341 365 412 433 
*Mixture of 16 tissues - mRNA | Various Vendors | CGd015 | 29 86 90 95 98 110 113 118 132 158 171 184 193 218-220 243 284 310 385 410 419 
*Mixture of 16 tissues - mRNA | Various Vendors | CGd016 | 3-4 20-21 29 38 85 88-89 95 105 119 122 131-133 140 185 211-212 225 256-257 273 276 302 318 379-380 390 400 419 


colon | Invitrogen | CLN001 | 4 25 33 85 138 146 148 158-159 198 210 229 301 360 384 397 
cervix | BioChain | CVX001 | 3 5 10-11 18 20-21 24-25 29 36 41 47 57 63 72 74 76 86 90 94 104 108-109 111 125 127 130 134 138 144 147 162 174 178-179 182 186 189 193 197 211 222 225-226 228 232 241 243 257 261 267 270 273-275 278-281 288-289 298 301-302 305 315 319 324-325 329 331 337-338 359 391-392 395 420 


endothelial cells | Stratagene | EDT001 | 3-6 18-19 24 27-29 35 72 76 79-80 85 89 96 98 104-107 111 117 119-121 124-131 134 136 138-139 141 144 146-147 149 152 158-159 166-167 170-173 178-179 182-183 186-187 191 193-194 196-197 200 210-211 222-224 226 231-232 236 241 243 246 248 253-256 258-259 276 279 282 287 292 300 302-303 315 329 337-338 358-362 382-383 385-388 


esophagus | BioChain | ESO002 | 257 
fetal brain | Clontech | FBR001 | 34 
fetal brain | Clontech | FBR004 | 3 139 144 271 284 337-338 


fetal brain | Clontech | FBR006 | 4 6-11 14 18-21 24 28 31 37-38 40 63 76 85 87 89-90 94-95 97 105 108-109 112-113 115 117-120 127-130 133 138 140 144-146 148 170 172 175 180 182 186-188 190 192 194 199 201 203 209-210 215 219 222 229-230 232-233 240 243 245 253-255 270 273 276 281 288-289 292 295 304 315 317 319 324 330-331 356-357 359-360 364 367-368 379-380 383 389 397 399-401 408-409 411 413 419 421 423 



fetal brain | Invitrogen | FBT002 | 2 14 19 23 28 31 90 94 105 121 124 126 131 135 139 142 149 158 186 193 198 210 214-215 232 239 242 248 255 267 326 332 365 369 371 376-383 394 399 
fetal heart | Invitrogen | FHR001 | 4 7-8 10-11 14 17-21 28-29 31-32 60 64-65 73 85 87 92 95 102-103 105 108 111 113 117 119 121 125 128-129 134-135 141 152 154 156-157 160-161 172 176 178 194 196 198-200 203 208 212 215 218 222 226 229 233-234 253-257 261 265 272 276 281 292-293 295 303 305 319 325 327 337-338 341 345 349 354-355 367-368 389 395-396 398 412 417 436 
fetal kidney | Clontech | FKD001 | 1 14 22 94 110 115 132 134-135 146 178 189 199 235-236 242 247 257 267 292 295 359 
fetal kidney | Clontech | FKD002 | 22 31 38 40 46 94 122 127 131 156 160 194 198 229 253-254 270 292 303 319 354-355 389 396 
fetal kidney | Invitrogen | FKD007 | 303 
fetal lung | Clontech | FLG001 | 85 89 98-100 111 175 271 281 369 
fetal lung | Invitrogen | FLG003 | 84 88 106-107 122 135 140 146 160 181 246 272 284 292 328 330 396 404 416 426 



fetal liver - 
spleen 



Soares 



FLS001 



1-3 6-12 14 19 23 28-31 33 57 59-60 72-76 
78 80 83 85-138 140-141 143-144 146-155 
157-161 163-197 200 204 208 210-211 223 
225 230 232-233 235 241-243 245-266 268- 
273 277 281 285-287 292 297 303 314 329 
343 346-347 357-359 369 397 399 407 415 



fetal liver- 
spleen 



Soares 



FLS002 



1 3-4 6 10-12 23-24 29 31-33 35-37 53-54 
74-76 79 81-82 86-89 91 94-95 99-104 106- 
109 111-112 115 117-120 122 125-126 128- 
129 132 134 136-138 141 146 149 153 157- 
159 162-166 170 172 175 178-180 183 185- 
191 194 196-197 205 207-212 222-225 228 
232-233 239-241 248 251-252 255-256 258- 
259 261-262 264 266-267 270-271 273-275 
277-278 283 285 287 298 305 315 317-318 
322 330-332 337-338 341 343 349 357-360 
365 388 390-391 399 402 418 424 



fetal liver-spleen



Soares 



FLS003 



12 29 91 98 111 119 156 163 165 178 186 
193 210-211 276 286 315 322 346-347 357 
365 424 



fetal liver 



Invitrogen 



FLV001 



7-8 14 35 118 122-123 129 146 182 211 230 
232 248 251-252 264 287 304 337-338 344 
346-347 352 365 367-369 



fetal liver 



Clontech 



FLV002 



102-103 147 149 300 



fetal liver 



Clontech 



FLV004 



73 85 105 108 118 122 126 141 156-157 161 
165 170 178 180 182 194 215 218 225 240


Table 1



Tissue Origin


RNA/Tissue Source


Library Name


SEQ ID NO:




























242 


247 


251- 


-252


292 


330 


337- 


-338


369 


407 








411 


440 


















fetal muscle 


Invitrogen 


FMS002 


5 9 


17-18 20-21 


29 38 85 


88 


97 106-107 129 








131 


136 


150- 


-152


155 


165 


170 


179 


182 


192- 








193 


212- 


-213 


229 


234 


242 


258- 


-259 


270 


282 








286 


289 


300 


316 


319 


345 


351 


354- 


-355


360 








389 


396 


408 


410 


437 


439 










fetal skin 


Invitrogen 


FSK001 


2 4 


7-8 


29 33 42-43 


49 51-52 58 


74 82 85 








90 94 110-111 116 118 121 133 136 138-139 








145 


151 


154 


156- 


-157


161- 


162 


172 


181 


184 








186 


193 


198 


200 


205 


207 


209- 


-211


222 


227- 








230 


232 


235 


240 


246 


253- 


-257


266 


270 


276 








292 


295 


299 


316 


318 


323 


330 


332 


337- 


-340 








343 


357 


369 


389 


394- 


-395 


412 


422 


427 




fetal skin 


Invitrogen 


FSK002 


4 9 


42 44 51 66 


72 81 85 89- 


-90 95 98 105








112- 


-114 


119 


121 


129 


133 


135 


162 


172 


179- 








182 


197 


200 


208 


210 


231 


243- 


-244 


272 


304 








316 


330 


339 


354- 


-355 


357 


360 


389 


395 


410 








417 


437 


















fetal spleen 


BioChain 


FSP001 


157 


223 


















umbilical 


BioChain 


FUC001 


4-6 


20-21 25 29 


73-74 83 87 


89-91 94 101 


cord 






109 


120 


123 


125 


128 


130- 


-131


133 


141 


143- 








144 


147 


149 


154 


161 


165 


173 


175 


179 


184 








188 


210- 


-212 


217 


226 


235 


240 


248 


251- 


-252 








257 


262 


267 


270 


277 


293 


305 


307 


316 


319 








323 


327 


331 


341 


356 


359 


389 


392 


407 


416 


fetal brain 


GIBCO 


HFB001 


2-4 


16 20-21 74 


77 85 89-91 


96-98 104-105 








111 


114 


118 


121- 


-122 


124- 


-125


127- 


-128 


131 








134 


137- 


-140 


142 


144 


146- 


-148


151 


153 


158- 








159 


163- 


-164 


166 


173 


178 


180 


182 


191 


194 








196 


200 


203 


209- 


-214 


216- 


-232 


234- 


-236 


238- 








239 


243 


253- 


-255 


263 


270 


272- 


-273 


276 


281 








292 


310 


316 


319- 


-321 


332 


348 


357 


359 


365 








399 




















macrophage 


Invitrogen 


HMP001 


2 247 


infant brain 


Soares 


IB2002 


2-4 


7-8 


19-22 26-27 


31-32 35 73- 


-74 80 85 








89 91 96-98 


106- 


-107 


110 


112 


118- 


-119 


121- 








122 


125 


128-131 


134-144 


148 


153 


164 


166 








172- 


-173 


177 


180 


186 


-187 


191- 


-194 


196 


202- 








203 


208 


-210 


217 


219 


223- 


-224 


227 


229 


232- 








234 


236 


-237 


239 


241 


-243 


245 


248 


253-


-259 








273 


-275 


278- 


-279 


282 


287 


294 


298 


309 


314 








317 


322 


327 


330 


333 


-334 


341 


348- 


-350 


360 








368 


376 


379-380 


382 


396 


406 


424 






infant brain 


Soares 


IB2003 


3-4 


20- 


21 26 28 


31 


35 73 85 


95-96 110 113 








119 


122 


-123 


130- 


-131 


135 


138 


140 


142 


-143 








146 


153 


155 


170 


172 


-173 


186


191 


-193 


196 








209 


219 


223 


226 


229 


233- 


-234 


236 


239 


245 








248 


253 


-254 


256- 


-257 


273 


279 


291 


-292 


304 








314 


337 


-338 


343 


359 


367 


371 


376 


397 


413 


lung, 


Stratagene


LFB001 


3 6 


31 


72-73 90 


92 


105-107 124 


126-


127 133 


fibroblast 






136 


139 


144 


146 


172 


189 


198 


204 


233 


235 








246 


258 


-259 


268 


272 


276 


282 


310 


335 


359 








434 




















adult lung 


Invitrogen 


LGT002 


4 19-21 


28


33 35-36 


49 72 79 81 


85 


88 90- 








91 


94-95 101 106-107 109 118 120-125 127


130-


-131 


133 


135- 


-138 


141- 


-142 


144 


147 


149 








157 


159- 


-161


163 


166 


170- 


-173


193- 


-194 


196- 








197 


212 


216 


218 


221 


223 


226 


228- 


-229 


231 








233 


241 


247-248 


253- 


-255 


257 


261 


266 


-267 








270- 


-275 


277- 


-278 


282- 


-283 


292 


298 


301 


303 








315 


318 


324 


331 


335 


354- 


-355


359 


367 


369 








381 


392- 


-393 


398 














leukocytes 


GIBCO 


LUC001 


1-5 


15 19-21 28 


30-33 37 72 


74 91 94-95 








97-100 108-109 113 115 117 119-122 124-125 








127- 


-128 


134- 


-138 


141 


144 


146 


-148 


150- 


-151 








157-158 160 162-167 170-173 175-178 180-181








187 


189 


192 


194 


197 


200 


212 


-213 


215- 


-216 








218- 


-219 


223 


225 


228- 


-232 


241- 


-242 


245- 


-246 








251- 


-254 


261 


272 


-276 


278- 


-282


284 


287-


-290 








297- 


-298 


305 


307 


310- 


-314 


325 


331 


336 


340 








358- 


-359 


372 


399 


414 












leukocytes 


Clontech 


LUC003 


1 5 


124 


171 


176 


204 


225 


248 


253- 


-254 


283 








285 


307 


315 
















melanoma 


Clontech 


MEL004 


4-5 


24 37 72-74 


81 85 106-107 113 136 177 








203 


205- 


-207 


209 


231 


243 


284 


-285 


315- 


-316 








320 


326 


359 


374 


428 












mammary gland 


Invitrogen 


MMG001 


2 4- 


-5 7- 


-8 10-12 


29 31 34-35 


38 50 80-81 85 








89-90 92 94 


-97 105 108-109 


119-124 126 








128- 


-130 


135 


138 


-139 


141- 


-142


144 


146- 


-147 








153 


155 


157 


-159 


163 


178- 


-179


181- 


-182 


198 








200 


209- 


-210 


219 


223 


228 


230 


232- 


-233 


235- 








236 


239 


242 


248 


253- 


-255 


257 


260- 


-261


265- 








267 


270 


272 


281 


287 


292 


294 


315- 


-316 


318 








324 


327 


330 


337- 


-340 


354- 


-355


357 


369 


372 








383 


392- 


-395


401 


404 












neuron 


Stratagene


NTD001 


35 47 89-90 


111 


118 


164 


232 


253- 


-254 


276 








324 


331 


382 
















neuron 


Stratagene


NTR001 


20-21 37 122 147-149 170 179 181 186 212 








226 


258-259 


265 


276 


369 


436 


438 






neuronal 


Stratagene


NTU001


7-8 


37 55 80 85 


112 


118 


126 


-127 


133 


138 


cells 






140 


-141 


151 


170 


181 


210 


214 


225-


-226 


236 








243 


287 


328 


330 


-331 


357 


383 


400 


436 




pituitary 


Clontech


PIT004


92 124 159 231 














gland


























placenta 


Clontech 


PLA003 


34 46 88 126 128 159 182 186 197 201 267








278 


281- 


-282 


305 


330 


356 


361 


365 


418 




prostate 


Clontech 


PRT001 


18 36 72 74 


86 


95 106-107 111 118 122 144








161 


179 


211 


218 


233 


286 


297 








y«£\/i ■♦"11 m 


Invitrogen


rUSV-UUl 


9 31 85 


121 


128 


147 


171 


200 


219 


257 


292 








340 


394 


398 


407 


412 












salivary 


Clontech 


SAL001 


3 24 38 


80 


122 


136 147 189 241 282 296 310 


gland 






351 


392 


395 


415 














saliva gland 


Clontech 


SALS 03 


118 


small 


Clontech 


SIN001 


12 16 25 82 


-83 


89-90 93


95 


98 105-109 111 


intestine 






122 


-123 


125 


-128 


133-134 


137 


139 


142 


161 








167 


171 


184 


197 


201 


204 


212 


218 


236 


242- 








243 


248- 


-249 


253 


-254 


257 


267 


276 


284 


-285 








292 


297 


300 


303 


310 


313 


317 


-318 


325 


340 








343 


352 


354 


-355 


359 


383 


391 


416 






spinal cord 


Clontech 


SPC001 


3 39 84 


86 


94 96 105 115 117 130-131 134 








136 


141 


143 


148 


155 


176 


190 


-191 


203 


213 



224


233- 


-234 


236 


239 


279 


283 


298 


320- 


-321 








332 


336- 


-338 


356 


359 


365 


404-406 






thalamus 


Clontech 


THA002 


2 20-21 


23 74 81 85 


105- 


-106 


116 


121 


131 








146 


171 


185 


188 


200 


209 


219 


233 


239 


256 








258- 


-259 


273 


276 


362 


399 










thymus 


Clontech


THM001 


16 29 33 57 


80 82 85 90 


93-94 106-107 120 








126 


128 


134 


141 


161 


176 


194 


223 


228 


235 








253- 


-254 


261 


274- 


-275 


278 


285 


298 


319 


332 








336 


343 


353 


359 


425 












thymus 


Clontech 


THM002


1-2 


7-9 


14 26 34 44 


73 75 82 85 


87 94 98 








106- 


-107 


109 


-111 


117 


119- 


-120 


125- 


-126 


128- 








129 


139 


141 


144 


147- 


-148 


151 


154- 


-155 


162 








165 


170- 


-172 


175- 


-176 


179 


182 


186 


193- 


-194 








199 


-200 


208 


-209 


213 


218 


233 


235 


240 


242 








247 


253- 


-254 


257 


265 


276 


281 


287 


290 


305 








307 


312 


319 


336 


342 


354 


-356 


359 


364 


367 








399 


408 


412 


-413 


415 


419 


421 


426 


429- 


-433 


thyroid gland 


Clontech 


THR001


3 5 


7-8 


28 30-31 33 


73-77 80 82 


85 88 90- 








92 94 96-98 


105- 


-107 


109 


113 


117 


121-122 








124 


-125 


127 


-128 


130 


134 


136 


141 


143 


146- 








148 


152 


161 


-163 


166 


175 


177- 


-178 


181 


194 








199 


201 


204 


210 


212 


216 


218 


223 


-226 


228 








230 


-231 


234 


236 


241 


243 


246 


253 


-257 


261 








270 


272 


-273 


276 


-278 


281 


-283 


287 


292 


295 








298 


303 


-304 


308 


315 


323 


329 


335 


352 


359 








362 


401 


416 


-417 














trachea 


Clontech 


TRC001 


88 


138 


180


226 


228 279 


359 411 


436 




uterus 


Clontech 


UTR001 


3 10-11 


23 


77 92 106-107 109 111 141 197- 








198 


218 


241 


257 


270 


274 


-275 


302 


315 


329 








396 


400 


413 

















*The 16 tissue/mRNAs and their vendor sources are as follows: 1) normal adult brain
mRNA (Invitrogen), 2) normal adult kidney mRNA (Invitrogen), 3) normal fetal brain mRNA
(Invitrogen), 4) normal adult liver mRNA (Invitrogen), 5) normal fetal kidney mRNA
(Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal skin mRNA (Invitrogen), 8)
human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA (Clontech), 10) human
leukemia lymphoblastic mRNA (Clontech), 11) human thymus mRNA (Clontech), 12) human
lymph node mRNA (Clontech), 13) human spinal cord mRNA (Clontech), 14) human thyroid
mRNA (Clontech), 15) human esophagus mRNA (BioChain), 16) human conceptional umbilical
cord mRNA (BioChain).


1




Homo sapiens


membrane-associated nucleic acid
binding protein mRNA, partial cds.


2553 


54 


1 


gi7020305 


Homo sapiens 


cDNA FLJ20301 fis, clone HEP06569.


1728 


47 


1 


gi7294120 


Drosophila 
melanogaster 


CG16807 gene product 


1535 


53 


2 


AAY57911 


Homo sapiens 


Human transmembrane protein 
HTMPN-35. 


1258 


82 


2 


AAB88406 


Homo sapiens 


Human membrane or secretory protein 
clone PSEC0162. 


265 


39 


2 


gil4272664 


Homo sapiens 


unnamed protein product 


265 


39 




gil2654575 


Homo sapiens 


Similar to gp25L2 protein, clone 
MGC:2142 IMAGE:2967520, mRNA, 
complete cds. 


1116 


100 


3 


gil2845568 


Mus musculus 


putative 


1099 


98 


3 


gi996057 


Homo sapiens 


H.sapiens mRNA for gp25L2 protein. 


1096 


98


4 


gi9971050


Homo sapiens 


Human DNA sequence from clone
RP11-526K24 on chromosome 20.
Contains a novel gene, the 5' end of a
novel gene, two CpG islands, ESTs,
GSSs and STSs, complete sequence.


4348 


99 


4 


AAB95086 


Homo sapiens 


Human protein sequence SEQ ID 
NO:16999. 


3034 


99 


4 


gil0433753 


Homo sapiens 


cDNA FLJ12307 fis, clone 
MAMMA1001908. 


3034 


99 


5 


gi4689106 


Homo sapiens 


NADH-ubiquinone oxidoreductase B8 
subunit 


505 


100 


5 


gi2909862 


Homo sapiens 


NADH-ubiquinone oxidoreductase 
subunit CI-B8 mRNA, complete cds. 


505 


100 


5 


gil2539408 


Homo sapiens 


NDUFA2 gene for NADH 
dehydrogenase (ubiquinone) 1 alpha 
subcomplex 2, complete cds. 


505 


100 


6 


AAG64416 


Homo sapiens 


Human nucleoprotein. 


3765 


100 


6 


gii0443046 


Homo sapiens 


Human DNA sequence from clone
RP11-465L10 on chromosome 20.
Contains 10 CpG islands, ESTs, STSs
and GSSs. Contains the gene for a
novel protein similar to Drosophila
CG11399, the gene for a novel C2H2
type zinc finger protein similar to
chicken FZF-1, a Ferritin light
polypeptide (FTL) pseudogene, the
MMP9 gene for matrix
metalloproteinase 9 (gelatinase B,
92kD gelatinase, 92kD type IV
collagenase) (CLG4B), a novel gene,
the SLC12A5 gene for solute carrier
family 12, (potassium-chloride
transporter) member 5 (KIAA1176)
and the 3' end of gene KIAA1637,
complete sequence.


3765 


100 


6 


gil5426514 


Homo sapiens 


clone MGC:16205 IMAGE:3640928, 
mRNA, complete cds. 


3765 


100 


7 


AAG64416 


Homo sapiens 


Human nucleoprotein. 


3366 


100


7


gil0443046 


Homo sapiens 


Human DNA sequence from clone
RP11-465L10 on chromosome 20.
Contains 10 CpG islands, ESTs, STSs
and GSSs. Contains the gene for a
novel protein similar to Drosophila
CG11399, the gene for a novel C2H2
type zinc finger protein similar to
chicken FZF-1, a Ferritin light
polypeptide (FTL) pseudogene, the
MMP9 gene for matrix
metalloproteinase 9 (gelatinase B,
92kD gelatinase, 92kD type IV
collagenase) (CLG4B), a novel gene,
the SLC12A5 gene for solute carrier
family 12, (potassium-chloride
transporter) member 5 (KIAA1176)
and the 3' end of gene KIAA1637,
complete sequence.


3366 


100 


7 


gil5426514 


Homo sapiens 


clone MGC:16205 IMAGE:3640928, 
mRNA, complete cds. 


3366 


100 


8 


gil4571904 


Rattus 
norvegicus 


lysosomal amino acid transporter 1 


2145 


85 


8 


AAE04910 


Homo sapiens 


Human transporter and ion channel-23 
(TRICH-23) protein. 


1239 


56 


8 


gi7297404 


Drosophila 
melanogaster 


CG13384 gene product 


837 


43 


9 


AAB73686 


Homo sapiens 


Human oxidoreductase protein ORP- 
19. 


1301 


98 


9 


gi7291405 


Drosophila 
melanogaster 


T3dh gene product 


808 


59 


9 


gi5824752 


Caenorhabditis 
elegans 


predicted using Genefinder~contains
similarity to Pfam domain: PF00465
(Iron-containing alcohol
dehydrogenases), Score=177.7,
E-value=1.9e-50, N=2~cDNA EST
EMBL:Z14517 comes from this gene;
cDNA EST yk18d4.3 comes from this
gene~cDNA EST yk18d4.5 comes
from this gene; cDNA EST yk116f5.5
comes from this gene~cDNA EST
yk132h3.3 comes from this gene;
cDNA EST yk73d10.3 comes from this
gene~cDNA EST yk93e9.3 comes from
this gene; cDNA EST yk132h3.5
comes from this gene~cDNA EST
yk73d10.5 comes from this gene;
cDNA EST yk93e9.5 comes from this
gene~cDNA EST yk135b6.5 comes
from this gene; cDNA EST yk135b6.3
comes from this gene~cDNA EST
yk201e5.3 comes from this gene;
cDNA EST yk268b1.3 comes from this
gene~cDNA EST yk261d6.3 comes


685 


52
from this gene; cDNA EST yk262h11.3
comes from this gene~cDNA EST
yk292h11.3 comes from this gene;
cDNA EST yk304d8.3 comes from this
gene~cDNA EST yk344b7.3 comes
from this gene; cDNA EST yk351a6.3
comes from this gene~cDNA EST
yk366d9.3 comes from this gene;
cDNA EST yk368e3.3 comes from this
gene~cDNA EST yk372c11.3 comes
from this gene; cDNA EST yk389g3.3
comes from this gene~cDNA EST
yk422d2.3 comes from this gene;
cDNA EST yk381d7.3 comes from this
gene~cDNA EST yk201e5.5 comes
from this gene; cDNA EST yk267f6.5
comes from this gene~cDNA EST
yk268b1.5 comes from this gene;
cDNA EST yk261d6.5 comes from this
gene~cDNA EST yk262h11.5 comes
from this gene; cDNA EST yk292h11.5
comes from this gene~cDNA EST
yk304d8.5 comes from this gene;
cDNA EST yk344b7.5 comes from this
gene~cDNA EST yk368e3.5 comes
from this gene; cDNA EST yk372c11.5
comes from this gene~cDNA EST
yk351a6.5 comes from this gene;
cDNA EST yk366d9.5 comes from this
gene~cDNA EST yk389g3.5 comes
from this gene; cDNA EST yk422d2.5
comes from this gene~cDNA EST
yk560f4.3 comes from this gene;
cDNA EST yk625h5.3 comes from this
gene~cDNA EST yk381d7.5 comes
from this gene; cDNA EST yk560f4.5
comes from this gene~cDNA EST
yk625h5.5 comes from this gene






10 


AAB73686 


Homo sapiens 


Human oxidoreductase protein ORP- 
19. 


1552 


99 


10 


gi7291405 


Drosophila 
melanogaster 


T3dh gene product 


891 


56 


10 


gi5824752 


Caenorhabditis 
elegans 


predicted using Genefinder~contains
similarity to Pfam domain: PF00465
(Iron-containing alcohol
dehydrogenases), Score=177.7,
E-value=1.9e-50, N=2~cDNA EST
EMBL:Z14517 comes from this gene;
cDNA EST yk18d4.3 comes from this
gene~cDNA EST yk18d4.5 comes
from this gene; cDNA EST yk116f5.5
comes from this gene~cDNA EST
yk132h3.3 comes from this gene;
cDNA EST yk73d10.3 comes from this


730 


51
gene~cDNA EST yk93e9.3 comes from
this gene; cDNA EST yk132h3.5
comes from this gene~cDNA EST
yk73d10.5 comes from this gene;
cDNA EST yk93e9.5 comes from this
gene~cDNA EST yk135b6.5 comes
from this gene; cDNA EST yk135b6.3
comes from this gene~cDNA EST
yk201e5.3 comes from this gene;
cDNA EST yk268b1.3 comes from this
gene~cDNA EST yk261d6.3 comes
from this gene; cDNA EST yk262h11.3
comes from this gene~cDNA EST
yk292h11.3 comes from this gene;
cDNA EST yk304d8.3 comes from this
gene~cDNA EST yk344b7.3 comes
from this gene; cDNA EST yk351a6.3
comes from this gene~cDNA EST
yk366d9.3 comes from this gene;
cDNA EST yk368e3.3 comes from this
gene~cDNA EST yk372c11.3 comes
from this gene; cDNA EST yk389g3.3
comes from this gene~cDNA EST
yk422d2.3 comes from this gene;
cDNA EST yk381d7.3 comes from this
gene~cDNA EST yk201e5.5 comes
from this gene; cDNA EST yk267f6.5
comes from this gene~cDNA EST
yk268b1.5 comes from this gene;
cDNA EST yk261d6.5 comes from this
gene~cDNA EST yk262h11.5 comes
from this gene; cDNA EST yk292h11.5
comes from this gene~cDNA EST
yk304d8.5 comes from this gene;
cDNA EST yk344b7.5 comes from this
gene~cDNA EST yk368e3.5 comes
from this gene; cDNA EST yk372c11.5
comes from this gene~cDNA EST
yk351a6.5 comes from this gene;
cDNA EST yk366d9.5 comes from this
gene~cDNA EST yk389g3.5 comes
from this gene; cDNA EST yk422d2.5
comes from this gene~cDNA EST
yk560f4.3 comes from this gene;
cDNA EST yk625h5.3 comes from this
gene~cDNA EST yk381d7.5 comes
from this gene; cDNA EST yk560f4.5
comes from this gene~cDNA EST
yk625h5.5 comes from this gene






11 


AAB85166 


Homo sapiens 


Human Bcl-Gl polypeptide. 


1598 


87 


11 


gil4598300 


Homo sapiens 


unnamed protein product 


1598 


87 


11 


gil2584085 


Homo sapiens 


apoptosis regulator BCL-G long form 
(BCLG) mRNA, complete cds. 


1598 


87 


12 


gil5077865 


Mus musculus 


bullous pemphigoid antigen 1-b 


1253 


82


12


gil5077863 


Mus musculus 


bullous pemphigoid antigen 1-a 


1253 


82 


12 


gi6624582 


Homo sapiens 



Human DNA sequence from clone
RP1-61B2 on chromosome 6p11.2-12.
Contains isoforms 1 and 3 of BPAG1
(bullous pemphigoid antigen 1
(230/240kD), an exon of a gene similar
to murine MACF cytoskeletal protein,
STSs and GSSs, complete sequence.


733 


99 


13 


gi3702270 


Homo sapiens 


chromosome 19, cosmid R31408, 
complete sequence. 


887 


93 


13 


gi401845 


Homo sapiens 


ribosomal protein L18a mRNA,
complete cds. 


887 


93 


13 


gii3960144 


Homo sapiens 


ribosomal protein L18a, clone
MGC:4476 IMAGE:2961519, mRNA, 
complete cds. 


887 


93 


14 


AAB59090 


Homo sapiens 


Breast and ovarian cancer associated 
antigen protein sequence SEQ ID 798. 


496 


80 


14 


AAB44129 


Homo sapiens 


Human cancer associated protein 
sequence SEQ ID NO: 1574. 


453 


81 


14 


gil4198321 


Mus musculus 


ribosomal protein L31


453 


81 


15 


gi5689465 


Homo sapiens 


mRNA for KIAA1064 protein, partial 
cds. 


5643 


100 


15 



gi4884368 


Homo sapiens 


mRNA; cDNA DKFZp586L1220 
(from clone DKFZp586L1220); partial 
cds. 


1628 


100 


15 


gil3161145 


Homo sapiens 


zinc finger protein mRNA, complete 
cds. 


369 


36 


16 


gi5870832 


Mus musculus 


skm-BOPl 


2494 


94 


16 


gi5870834 


Mus musculus 


skm-BOP2 


2397 


91 


16 


gil 809322 


Mus musculus 


t-BOP 


2285 


93 


17 


gil3938126 


Mus musculus 


RIKEN cDNA 3732409C05 gene


2678 


98 


17 


gil2852375 


Mus musculus 


putative 


2678 


98 


17 


gi7024433 


Torpedo 
marmorata 


male sterility protein 2-like protein 


2307 


80 


18 


AAB95482 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 18007. 


1572 


67 


18 


gil4042809 


Homo sapiens 


cDNA FLJ14932 fis, clone
PLACE1009639. 


1572 


67 


18 


gil2053165 


Homo sapiens 


mRNA; cDNA DKFZp434K0427
(from clone DKFZp434K0427); 
complete cds. 


1572 


67 


19 


gi7243159 


Homo sapiens 


mRNA for KIAA1389 protein, partial 
cds. 


7842 


99 


19 


gi4151328 


Homo sapiens 


high-risk human papilloma viruses E6 
oncoproteins targeted protein E6TP1 
alpha mRNA, complete cds. 


3777 


53 


19 


gi4151330 


Homo sapiens 


high-risk human papilloma viruses E6 
oncoproteins targeted protein E6TP1 
beta mRNA, complete cds. 


3768 


53 


20 


gi7243159 


Homo sapiens 


mRNA for KIAA1389 protein, partial 
cds. 


7714 


98


20 


gi4151328 


Homo sapiens 


high-risk human papilloma viruses E6 
oncoproteins targeted protein E6TP1 


3806 


54 






WO 02/081731 



PCT/US02/01222 



Table 2 



SEQ ID NO:


Accession No.


Species 


Description 


Score 


% 
Identity 








alpha mRNA, complete cds. 






20 



gi4151330 


Homo sapiens 


high-risk human papilloma viruses E6
oncoproteins targeted protein E6TP1 
beta mRNA, complete cds. 


3797 


53 


21 


AAB95328 


Homo sapiens 


Human protein sequence SEQ ID
NO:17595. 


753 


61 


21 


AAB93757 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 13432. 


753 


61 


21 


AAB29657 


Homo sapiens 


Human membrane-associated protein 
HUMAP-14. 


753 


61 


22 


gi7673373 


Homo sapiens 


SCAN-related protein RAZ1 (RAZ1) 
mRNA, partial cds. 


1104 


100


22 


AAG93274 


Homo sapiens 


Human protein HP 10543. 


900 


100 


22 


AAB42846 


Homo sapiens 


Human ORFX ORF2610 polypeptide 
sequence SEQ ID NO:5220. 


900 


100 


23 


gi7242963


Homo sapiens 


mRNA for KIAA1304 protein, partial 
cds. 


5409 


99 


23 


gi3413874 


Homo sapiens 


mRNA for KIAA0456 protein, partial 
cds. 


3695 


67 


23 


AAB30852 


Homo sapiens 


Amino acid sequence of human signal 
transduction protein SGT6-1. 


3685 


68 


24 


AAG64386 


Homo sapiens 


Human alcohol dehydrogenase 39. 


1228 


77 


24 


gil2861800 


Mus musculus 


putative 


1083 


66 


24 


gi3878713


Caenorhabditis 
elegans 


weak similarity with quinone
oxidoreductase, contains similarity to
Pfam domain: PF00107 (Zinc-binding
dehydrogenases), Score=-80.6,
E-value=6.2e-06, N=1~cDNA EST
yk164b4.5 comes from this
gene~cDNA EST yk164b4.3 comes
from this gene~cDNA EST yk264G.5
comes from this gene


556 


39 


25 


AAE02629 


Homo sapiens 


Human secreted protein Zalpha37. 


2481 


100 


25 


gil4536691 


Homo sapiens 


unnamed protein product 


2481 


100 


25 


AAY99419 


Homo sapiens 


Human PRO1780 (UNQ842) amino 
acid sequence SEQ ID NO:282. 


1960 


77 


26 


gi6102869 


Homo sapiens 


mRNA; cDNA DKFZp434H1235
(from clone DKFZp434H1235); partial 
cds. 


831 


100 


26


gllzo3i439 


Mus musculus 


putative 


789 


94 


26 


gi2198807


Gallus gallus 


monocarboxylate transporter 3 


505 


29 


27 


gi7299069 


Drosophila 
melanogaster 


CG11755 gene product


205 


34 


27 


gi3875367 


Caenorhabditis 
elegans 


contains 3 cysteine rich repeats 


136 


41 


27 


gi3249080 


Arabidopsis
thaliana


Contains similarity to MYB 
transcription factor isolog T01O24. 1 
gb|2288980 from A. thaliana BAC 
gb|AC002335. 


69 


35 


28 


gil 1041628 


Homo sapiens 


RPL6 gene for ribosomal protein L6, 
complete cds. 


1207 


98 


28


gi433416 


Homo sapiens 


Human mRNA for DNA-binding 
protein, TAXREB107, complete cds. 


1207 


98


28


gil3278717 


Homo sapiens 


ribosomal protein L6, clone 
MGC:1635 IMAGE:2823733, mRNA, 
complete cds. 


1207 


98 


29 


AAG03810 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
7891. 


845 


100 


29 


gil86800 


Homo sapiens 


Human ribosomal protein L12 mRNA, 
complete cds. 


845 


100 


29 


gi!4198333 


Homo sapiens 


ribosomal protein L12, clone 
MGC:9760 IMAGE:3855674, mRNA,
complete cds. 


845 


100 


30 


AAB95051 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 16849. 


2965 


100 


30 


gil0433519 


Homo sapiens 


cDNA FLJ12118 fis, clone
MAMMA1000085, weakly similar to
PUTATIVE CYSTEINYL-TRNA 
SYNTHETASE C29E6.06C (EC 
6.1.1.16). 


2965 


100 


30 


gil3938199 


Homo sapiens 


hypothetical protein FLJ12118, clone
MGC:15044 IMAGE:2822557, mRNA,
complete cds. 


2959 


99 


31 


gil2858123 


Mus musculus 


putative 


2441 


73 


31 


gi7959195 


Homo sapiens 


mRNA for KIAA1467 protein, partial 
cds. 


2232 


100 


31


gil3278148 


Mus musculus 


Similar to RIKEN cDNA 8430419L09 
gene 


794 


83 


32 


gil5530305 


Homo sapiens 


Similar to RIKEN cDNA 1700045119 
gene, clone MGC:2647 
IMAGE:3509621, mRNA, complete 
cds. 


1245 


84 


32 


gi9858803 


Mus musculus 


Ztp228 


512 


47 


32 


AAG75629 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:6393.


511 


46 


33 


gi8101071 


Homo sapiens 


golgin-like protein (GLP) gene, 
complete cds. 


312 


46 


33 


gi8099669 


Homo sapiens 


golgin-like protein (GLP) mRNA, 
complete cds. 


312 


46 


33 


gil 1037008 


Human 
herpesvirus 8 


latent nuclear antigen 


245 


40 


34 


gi437985 


Canis
familiaris 


Rab 12 protein 


1071 


99 


34 


gi206531 


Rattus 
norvegicus 


RAB12 


995 


96 


34 


gil2851149 


Mus musculus 


putative 


819 


96 


35 


gil3543689 


Homo sapiens 


Similar to RIKEN cDNA 4933405K01
gene, clone MGC:14799
IMAGE:4068454, mRNA, complete 
cds. 


1077 


96 


35 


gil2805373 


Mus musculus 


Unknown (protein for MGC:7298) 


950 


84 


35 


gil2855529 


Mus musculus 


putative 


642 


79 


36 


gil2697979 


Homo sapiens 


mRNA for KIAA1717 protein, partial 
cds. 


1982 


100 


36 


gil651678 


Synechocystis 
sp.PCC6803 


ORF_ID:skl485~hypothetical protein


185 


34


36


gi2739367 


Arabidopsis 
thaliana


putative phosphatidylinositol-4- 
phosphate 5-kinase 


153 


28 


37 


gi3800892 


Homo sapiens 


neurexin III-alpha gene, partial cds.


1255 


99 


37 


" gi294602 


Rattus 
norvegicus 


neurexin III-alpha


1160 


91 


37 


gi205716 


Rattus 
norvegicus 


neurexin II-alpha-a 


561 


50 


38 


gil0047315 


Homo sapiens 


mRNA for KIAA1619 protein, partial 
cds. 


4447 


99 


38 


gi8217424 


Homo sapiens 


Human DNA sequence from clone
RP11-108L7 on chromosome 10.
Contains part of the gene for a novel
Insulin-like growth factor binding type
protein with Kazal-type serine protease
inhibitor domain, the gene for a novel
protein similar to rat tricarboxylate
carrier, the gene for a novel PDZ
(DHR, GLGF) domain protein, the
gene for a novel protein similar to
KIAA0552, KIAA0341 and Fugu
hypothetical protein 2, the gene for a
novel protein similar to Plasmodium
POM1 and C. elegans F46G11.1, a
putative novel gene, the SEMA4G gene
for semaphorin 4G and a novel gene.
Contains ESTs, STSs, GSSs and seven
putative CpG islands, complete
sequence.


4407 


99 


38 


gi4836757 


Mus musculus 


semaphorin subclass 4 member G 


4021 


90 


39 


gi10438664


Homo sapiens 


cDNA: FLJ22324 fis, clone

xxp fencer i 


307 


100 


39 


gil3559240 


Homo sapiens 


Human DNA sequence from clone
RP5-842G6 on chromosome 20.
Contains the 5' end of a novel gene, the
3' end of the gene for a novel protein
similar to SEL1L (sel-1 (suppressor of
lin-12, C.elegans)-like), ESTs, STSs
and GSSs, complete sequence.


307 


100 


39 


gil3543669 


Homo sapiens 


hypothetical protein FLJ22324, clone
MGC:14701 IMAGE:4247211, mRNA, 
complete cds. 


307 


100 


40 


gil4595019 


Homo sapiens 


mRNA for keratin 6 irs (KRT6IRS 
gene). 


2615 


99 


40 


gi6092075 


Mus musculus 


type II cytokeratin f 


2414 


91 


40 


gil5559584 


Homo sapiens 


Similar to keratin 6A, clone 
MGC:20671 IMAGE:3639270, mRNA, 
complete cds. 


1468 


57 




gil2655452 


Homo sapiens 


mRNA for keratin associated protein 
4.7 (KRTAP4.7 gene).


1157 


86 


41 


gil2655464 


Homo sapiens 


partial mRNA for keratin associated 
protein 4.15 (KRTAP4.15 gene). 


1090 


88 


41 


gil2655462 


Homo sapiens 


mRNA for keratin associated protein 
4.14 (KRTAP4.14 gene).


1063 


84


42


gi553772 


Homo sapiens 


Human Tcr-C-delta gene, exons 1-4; 
Tcr-V-delta gene, exons 1-2; T-cell 
receptor alpha (Tcr-alpha) gene, J1-J61 
segments; and Tcr-C-alpha gene, exons 
1-4. 


110


100 


42 


gi4379087 


Homo sapiens 


mRNA for TCR alpha variable region, 
patient AF31. 


73 


46 


42 


AAW40057 


Homo sapiens 


Cellular transcriptional factor p300. 


71 


42 


43 


gi15866589


Capsella 
rubella 


hypothetical protein 


97 


30 


43 


gi3879045 


Caenorhabditis 
elegans 


R102.6 


96 


34 


43 


AAY56133 


Homo sapiens 


Human N-methyl-D-aspartate receptor 
2 subunit SEQ ID NO:54. 


94 


52 


44 


gi13569345


Homo sapiens 


pregnancy-associated plasma 
preproprotein-A2 mRNA, complete 
cds. 


9839 


99 


44 


gi10639043


Homo sapiens 


mRNA for pregnancy-associated 
plasma protein-E (PAPPE gene).


8966 


99 


44 


gi1142970


Homo sapiens 


Human pregnancy-associated plasma 
protein-A preproform (PAPPA)
mRNA, complete cds. 


3856 


45 


45 


gi12851017


Mus musculus 


putative 


578 


83 


45 


gi4490653 


Schizosacchar 
omyces pombe 


profilin. 


186 


35 


45 


gi440266 


Acanthamoeba 
castellanii 


profilinl 


166 


34 


46 


gi1617480


Comamonas 
testosteroni 


unknown 


712 


82 


46 


gi3046394 


Ralstonia 
eutropha 


phbF 


563 


66 


46 


gi6683782 


Burkholderia 
sp. DSMZ 
9242 


unknown 


560 


61 


47 


gi9229934 


Mus musculus 


midnolin 


2103 


78 


47 


AAB56832 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO:1410. 


912 


71 


47 


gi15929300


Homo sapiens 


Similar to midnolin, clone 
IMAGE:3958934, mRNA, partial cds. 


907 


100 


48 


gi13377624


Homo sapiens 


calicin mRNA, complete cds. 


3089 


99 


48 


gi854100 


Homo sapiens 


H.sapiens mRNA for calicin (partial). 


3076 


99 


48 


gi853784 


Bos taurus


calicin 


2896 


91 


49 


AAB68411 


Homo sapiens 


Amino acid sequence of a human 
NOV2 polypeptide. 


2131 


100 


49 


AAY99407 


Homo sapiens 


Human PRO1337 (UNQ692) amino
acid sequence SEQ ID NO:236. 


2101 


99 


49 


AAB68414 


Homo sapiens 


Amino acid sequence of NOV2 
polypeptide clone TA-cgAL132708 A. 


2014 


99 


50 


gi12082748


Mus musculus 


T-box transcription factor TBX18 


2972 


93 


50 


gi5102617 


Homo sapiens 


Human DNA sequence from clone 
33L1 on chromosome 6q14.1-15.
Contains the gene for novel T-box 
(Brachyury) family protein. Contains 


2634 


100 
ESTs, STSs, GSSs and two putative
CpG islands, complete sequence. 






50 


gi12849661


Mus musculus 


putative 


2223 


96 


51 


gi12843048


Mus musculus 


putative 


339 


72 


51 


gi6691626 


Homo sapiens 


RAGE mRNA for advanced glycation 
endproducts receptor, complete cds. 


111 


32 


51 


gi190846


Homo sapiens 


Human receptor for advanced 
glycosylation end products (RAGE) 
mRNA, partial cds. 


111 


32 


52 


AAG71840 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1521. 


1313 


85 


52 


AAG71839 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1520. 


1226 


81 


52 


AAG71837 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1518. 


1159 


77 


53 


AAB94026 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14163. 


966 


98 


53 


gi10433955


Homo sapiens 


cDNA FLJ12457 fis, clone
NT2RM1000666, weakly similar to 
DNA-BINDING PROTEIN A. 


966 


98 


53 


gi7295442 


Drosophila 
melanogaster 


CG17334 gene product 


302 


47 


54 


gi8980396 


Homo sapiens 


mRNA for T-cell antigen receptor- 
alpha, clone Pil-la, partial 


566 


97 


54 


gi2358063 


Homo sapiens 


T-cell receptor alpha delta locus from 
bases 752679 to 1000555 (section 4 of 
5) of the Complete Nucleotide 
Sequence. 


565 


100 


54 


gi623149 


Macaca 
mulatta 


T-cell receptor alpha


512 


85 


55 


gi2792496 


Rattus 
norvegicus 


tulip 2 


2437 


86 


55 


gi4884288 


Homo sapiens 


mRNA; cDNA DKFZp566D133 (from 
clone DKFZp566D133); partial cds. 


1983 


99 


55 


AAB41763 


Homo sapiens 


Human ORFX ORF1527 polypeptide 
sequence SEQ ID NO:3054. 


1976 


98 


56 


gi15524592


Homo sapiens 


unnamed protein product 


1033 


52 


56 


gi537514 


Homo sapiens 


Human arylacetamide deacetylase 
mRNA, complete cds.


1033 


52 


56 


AAB54079 


Homo sapiens 


Human pancreatic cancer antigen 
protein sequence SEQ ID NO:531. 


1017 


51 


57 


AAB33831 


Homo sapiens 


Human secreted protein BLAST search 
protein SEQ ID NO: 175. 


149 


35 


57 


gi1109682


Bos taurus


G-protein gamma-12 subunit 


149 


35 


57 


AAW09416 


Homo sapiens 


Human G protein gamma-7 subunit 


144 


33 


58 


gi12082750


Mus musculus 


T-box transcription factor TBX20 


1469 


93 


58 


gi9909810 


Mus musculus 


T-box transcription factor 


1469 


93 


58 


gi7229717 


Danio rerio 


H15-related T-box transcription factor 
hrT 


1346 


85 


59 


gi4185946 


Human 
endogenous 
retrovirus K 


gag protein 


146 


26 


59 


gi5802821 


Homo sapiens 


endogenous retrovirus HERV-K108, 


146 


26 
complete sequence.






59 


gi5802814 


Homo sapiens 


endogenous retrovirus HERV-K103, 
complete sequence. 


146 


26 


60 


AAB94756 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15815. 


126 


42 


60 


gi332612 


Gibbon ape 
leukemia virus


pol polyprotein 


113 


50 


60 


gi3133302


Sus scrofa


pol protein 


110 


53 


61 


gi10121625


Gillichthys 
mirabilis 


60S acidic ribosomal protein P1


127 


81 


61 


AAB44012 


Homo sapiens 


Human cancer associated protein 
sequence SEQ ID NO: 1457. 


125 


78 


61 


AAB43434 


Homo sapiens 


Human cancer associated protein 
sequence SEQ ID NO:879.


125 


78 


62 


AAB12585 


Homo sapiens


Human T cell activating protein SEQ
ID NO:4.


140 


XI 


62 


gi12805221


Mus musculus


IvrrmHncvte anriupn pnrrmlf** 






62 


gi198924


Mus musculus 


Ly-6A.2 


140 


37 


63 


gi6969165


Homo sapiens


Human DNA sequence from clone
RP3-475N16 on chromosome 6p12.3-
21.2. Contains the genes for CTG4A
[illegible]
pre-T cell receptor alpha, a novel
protein similar to RPL7 A (60S 
ribosomal protein L7A) and the 3' end 
of gene KIAA0240. Contains ESTs, 
STSs, GSSs and four putative CpG 
islands, complete sequence. 




o/ 


63 




Mus musculus 


putative 


512 


59 


63 


gi15293877


Ictalurus 
punctatus 


ribosomal protein L7 


314 


38 


64 


gi181573


Homo sapiens 


Human cytokeratin 8 (CK8) gene, 
complete cds. 


1147 


79 


64 


gi181400


Homo sapiens 


Human cytokeratin 8 mRNA, complete 
cds. 


1147 


78 


64 


gi400416 


Homo sapiens 


H.sapiens KRT8 mRNA for keratin 8. 


1147 


79 


65 


gi13620887


Mus musculus 


mitochondrial ribosomal protein S6 


633 


100 


65 


gi13620885


Homo sapiens 


MRPS6 mRNA for mitochondrial
ribosomal protein S6, partial cds. 


565 


85 


65 


gi14603226


Homo sapiens 


clone MGC:19576 IMAGE:4304420,
mRNA, complete cds. 


565 


85 


66 


gi13537119


Homo sapiens 


mRNA for PAR-6 gamma, complete 
cds. 


1956 


100 


66 


gi8037909 


Mus musculus


PAR6A 


1490 


76 


66 


gi9453884 


Homo sapiens 


mRNA for 16-5-5, partial cds. 


1304 


93 


67 


AAB95293 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17517. 


776 


79 


67 


AAG81270 


Homo sapiens 


Human AFP protein sequence SEQ ID 
NO:58. 


776 


79 


67 


gi14035848


Homo sapiens 


unnamed protein product 


776 


79 


68 


gi7020759 


Homo sapiens 


cDNA FLJ20565 fis, clone REC00542.


930 


60 


68 


gi15216181


Homo sapiens 


mRNA for putative 67-11-3 protein.


927 


60 


68 


gi15930069


Homo sapiens 


Similar to hypothetical protein 
FLJ20565, clone MGC:8850


917 


60 
IMAGE:3914396, mRNA, complete
cds. 






69 


gi3228237 


Homo sapiens 


UHS KerB gene. 


810 


72 


69 


gi200962 


Mus musculus 


serine 1 ultra high sulfur protein 


755 


69 


69 


gi32472 


Homo sapiens 


H. sapiens mRNA for high-sulphur 
keratin. 


749 


71 


70 


AAB92789 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 11284. 


3518 


100 


70 


gi7022420 


Homo sapiens 


cDNA FLJ10407 fis, clone 
NT2RM4000520. 


3518 


100 


70 


gi13111786


Homo sapiens 


hypothetical protein FLJ10407, clone
MGC:970 IMAGE:3509727, mRNA,
complete cds. 


3511 


99 


71 


gi13325178


Homo sapiens 


Similar to RIKEN cDNA 2210016F16 
gene, clone MGC:10999
IMAGE:3638524, mRNA, complete 
cds. 


856 


100 


71 


gi7291278 


Drosophila 
melanogaster 


CG9752 gene product 


744 


43 


71 


gi2854153 


Caenorhabditis 
elegans 


Hypothetical protein C11D2.4


729 


45 


72 


gi7020991 


Homo sapiens 


cDNA FLJ20718 fis, clone HEP17872. 


3013 


100 


72 


gi15680144


Homo sapiens 


hypothetical protein FLJ20718, clone
IMAGE:4577269, mRNA, partial cds. 


2906 


99 


72 


gi10801646


Macaca 
fascicularis 


hypothetical protein 


1097 


99 


73 


AAG93290 


Homo sapiens 


Human protein HP10650.


1215 


100 


73 


gi14587195


Homo sapiens 


FAPP1-associated protein 1 (FASP1)
mRNA, complete cds. 


1215 


100 


73 


gi8118225


Homo sapiens 


chromosome 21 unknown mRNA. 


1215 


100 


74 


gi10436998


Homo sapiens 


cDNA: FLJ21011 fis, clone
CAE04289. 


2522 


100 


74 


gi15030282


Homo sapiens 


clone MGC:16827 IMAGE:3855873, 
mRNA, complete cds. 


2522 


100 


74 


gi8570641 


Homo sapiens 


clone 133K02 unknown mRNA. 


2514 


99 


75 


gi6599255 


Homo sapiens 


mRNA; cDNA DKFZp434C0328 
(from clone DKFZp434C0328). 


1612 


100 


75 


gi6330416 


Homo sapiens 


mRNA for KIAA1201 protein, partial 
cds. 


554 


38 


75 


AAB74726 


Homo sapiens 


Human membrane associated protein 
MEMAP-32. 


496 


35 


76 


gi7021059 


Homo sapiens 


cDNA FLJ20758 fis, clone HEP01508.


1450 


100 


76 


AAW88552 


Homo sapiens 


Secreted protein encoded by gene 19 
clone HSAVU34. 


1429 


100 


76 


gi15341707


Homo sapiens 


clone MGC:19979 IMAGE:3939273,
mRNA, complete cds. 


1429 


100 


77 


AAB95410 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17796. 


774 


100 


77 


gi10435394


Homo sapiens 


cDNA FLJ13391 fis, clone
PLACE1001241. 


774 


100 


77 


gi10503974


Homo sapiens 


clone SP24 unknown mRNA. 


765 


99 


78 


gi7020587 


Homo sapiens 


cDNA FLJ20467 fis, clone KAT06638.


737 


100 
78


AAB42883 


Homo sapiens 


Human ORFX ORF2647 polypeptide 
sequence SEQ ID NO:5294. 


530 


100 


78 


AAB56642 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO: 1220. 


530 


100 


79 


AAW93948 


Homo sapiens 


Human regulatory molecule HRM-4 
protein. 


441 


91 


79 


gi12852696


Mus musculus 


putative 


386 


47 


79 


gi12751103


Homo sapiens 


PNAS-129 mRNA, complete cds. 


348 


100 


80 


gi7243053 


Homo sapiens 


mRNA for KIAA1336 protein, partial 
cds. 


3851 


99 


80 


gi7292144 


Drosophila 
melanogaster 


CG2069 gene product 


1634 


44 


80 


gi1065457


Caenorhabditis 
elegans 


C54G7.4 gene product 


706 


25 


81 


gi10439581


Homo sapiens 


cDNA: FLJ23023 fis, clone 
LNG01678. 


652 


100 


81 


gi7021132 


Homo sapiens 


cDNA FLJ20813 fis, clone
ADSE01247. 


652 


100 


81 


AAG74674 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5438.


556 


92 


82 


gi5262611


Homo sapiens


mRNA; cDNA DKFZp434I114 (from
clone DKFZp434I114); complete cds.




mo 


82 


gi11493368


Homo sapiens


Human DNA sequence from clone

RP5-1009E24 on chromosome 20 
Contains the SN gene encoding 
sialoadhesin, a novel gene similar to 
KIAA0417, the CENPB gene for 
centromere protein B, the CDC25B 
gene for Cell division cycle protein 
25B, three novel genes, the 5' end of
gene KIAA1271, nine CpG islands, 
ESTs, STSs and GSSs, complete 
sequence. 


AIR 

OJO 


inn 


82 


gi13543798


Mus musculus 


RIKEN cDNA 4931426X16 gene


680 


92 


83 


AAB57003 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO:1581. 


1302 


99 


83 


AAR60558 


Homo sapiens 


Human basigin I. 


1302 


99 


83 



gi3492872 


Homo sapiens 


chromosome 19, cosmid F18382 
(LLNLF-140D2) and 3' overlapping 
restriction fragment, complete 
sequence. 


1302 


99 


84 


gi9187614 


Homo sapiens 


mRNA full length insert cDNA clone
EUROIMAGE 1759349. 


580 


100 


84 


AAB01394 


Homo sapiens 


Neuron-associated protein. 


70 


39 


84 


AAB54358 


Homo sapiens 


Human pancreatic cancer antigen 
protein sequence SEQ ID NO:810. 


70 


39 


85 


gi15986445


Homo sapiens 


p90 autoantigen mRNA, complete cds. 


4513 


99 


85 


gi7959315 


Homo sapiens 


mRNA for KIAA1524 protein, partial 
cds. 


4357 


99 


85 


AAB95207 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17311. 


2341 


100 


86 


gi7959231 


Homo sapiens 


mRNA for KIAA1485 protein, partial 
cds. 


5813 


99 
86


AAB40418 


Homo sapiens 


Human ORFX ORF182 polypeptide 
sequence SEQ ID NO:364.


708 


99 


86 


gi5901529 


Homo sapiens 


C2H2 type Kruppel-like zinc finger
protein splice variant b (ZNF236) 
mRNA, complete cds. 


520 


24 


87 


gi7243270 


Homo sapiens 


mRNA for KIAA1436 protein, partial 
cds. 


4604 


99 


87 


gi5051974 


Mus musculus 


F2 alpha prostoglandin regulatory 
protein 


4195 


89 


87 


gi1054884


Rattus 
norvegicus 


prostaglandin F2a receptor regulatory 
protein precursor 


4191 


88 


88 


gi13241286


Mus musculus 


GABA(A) receptor-associated protein-
like 2


607 


100 


88 


gi2104570 


Rattus 
norvegicus 


GEF-2 


607 


100 


88 


gi4433387 


Bos taurus 


general protein transport factor p16


607 


100 


89 


gi15859535


Homo sapiens 


unnamed protein product 


5935 


99 


89 


gi3043606 


Homo sapiens 


mRNA for KIAA0541 protein, partial 
cds. 


5890 


100 


89 


gi15624075


Homo sapiens 


TGF-beta resistance-associated protein 
TRAG (TRAG) mRNA, partial cds. 


5719 


96 


90 


gi337370 


Homo sapiens 


Human rapamycin- and FK506-binding
protein, complete cds. 


740 


100 


90 


gi13097252


Homo sapiens 


Similar to FK506 binding protein 2 (13 
kDa), clone MGC:5177
IMAGE:3445148, mRNA, complete
cds. 


740 


100 


90 


AAQ31004 aa 
1 


Homo sapiens 


hRFKBP cDNA. 


735 


99 


91 


gi12053147


Homo sapiens 


mRNA; cDNA DKFZp434F1726 (from 
clone DKFZp434F1726). 


1450 


100 


91 


gi412195 


Homo sapiens 


unknown 


265 


98 


91 


AAR04931 


Homo sapiens 


Interferon-gamma receptor segment 
from clone 39 responsible for binding
the target


260 


96 


92 


gi10437948


Homo sapiens 


cDNA: FLJ21783 fis, clone HEP00284. 


3276 


100 


92 


AAB95352 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17643. 


1953 


99 


92 


gi10435077


Homo sapiens 


cDNA FLJ13171 fis, clone 
NT2RP3003819. 


1953 


99 


93 


gi12803319


Homo sapiens 


clone MGC:3090 IMAGE:3347913, 
mRNA, complete cds. 


4837 


99 


93 


gi14044064


Homo sapiens 


hypothetical protein DKFZp762M115,
clone MGC:14418 IMAGE:4302613,
mRNA, complete cds. 


4831 


99 


93 


gi10047337


Homo sapiens 


mRNA for KIAA1630 protein, partial 
cds. 


4671 


100 


94 


AAB70535 


Homo sapiens 


Human PRO5 protein sequence SEQ
ID NO: 10. 


2979 


100 


94 


gi13185719


Homo sapiens 


unnamed protein product 


2979 


100 


94 


AAB94106 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14334. 


2334 


100 


95 


gi12837873


Mus musculus 


putative 


2370 


75 


95 


gi13195574


Mus musculus 


Praja1 isoform a


2339 


75 


yo 




Homo sapiens


Human protein sequence SEQ ID
NO:13691.


1041 


yy 






Homo sapiens


Human mRNA for KIAA0301 gene,
partial cds.


lUUZO 


inn 


ok 




Homo sapiens


Human DNA sequence from clone
RP1-12208 on chromosome 6q14.2-
16.1. Contains the 3' part of a novel
gene partially coded for by KIAA0301,
a novel gene and the 3' part of the gene
KIAA0957. Contains ESTs, STSs, 
GSSs and a putative CpG island, 
complete sequence. 


1 (\fklfl 

JLUOZO 


inn 


96 


gi10727627


Drosophila 
melanogaster 


CG13185 gene product 


1452 


34 




A A DMO 1 O 

AAB8231o 


Homo sapiens 


Human immunoglobulin receptor 
IRTA5 protein. 


2235 


100


97 


gil552o831 


Homo sapiens 


Fc receptor-like protein 1 (FCRH1)
mRNA, complete cds. 


2235


100


5/7 


gi9y30y21 


Homo sapiens 


Human DNA sequence from clone 
RP11-367J7 on chromosome 1. 
Contains (part of) two or more genes 
for novel Immunoglobulin domains 
containing proteins, a SON DNA 
binding protein (SON) pseudogene, a 
voltage-dependent anion channel 1 
(VDAC1) (plasmalemmal porin) 
pseudogene, ESTs, STSs and GSSs, 
complete sequence. 


15J3 


100


08 


AAIdoZoIo 


Homo sapiens 


Human immunoglobulin receptor 
IRTA5 protein.


Zl / / 


OQ 

yo 


JJo 




Homo sapiens 


Fc receptor-like protein 1 (FCRH1)
mRNA, complete cds. 


Zll 1 


no 


Qft 
70 




Homo sapiens


Human DNA sequence from clone 
RP11-367J7 on chromosome 1.
Contains (part of) two or more genes
for novel Immunoglobulin domains
containing proteins, a SON DNA
binding protein (SON) pseudogene, a
voltage-dependent anion channel 1
(VDAC1) (plasmalemmal porin) 
pseudogene, ESTs, STSs and GSSs, 
complete sequence. 




100


99 


gi10438861


Homo sapiens 


cDNA: FLJ22461 fis, clone 
HRC10107. 


4904 


100 


99 


gi15079400


Homo sapiens 


clone MGC:16796 IMAGE:3855477,
mRNA, complete cds. 


4899 


99 


99 


AAU03497 


Homo sapiens 


Human sterol sensing domain protein. 


4047 


99 


100 


gi6524024 


Mus musculus 


mammalian inositol hexakisphosphate 
kinase 1 


1031 


50 


100 


gi10280996


Rattus 
norvegicus 




1027 


49 


100 


gi6683115


Homo sapiens 


mRNA for KIAA0263 protein, partial 


1021 


49 
cds.






101 


gi6524024 


Mus musculus 


mammalian inositol hexakisphosphate 
kinase 1 


1037 


51 


101 


gi10280996


Rattus 
norvegicus 


inositol hexakisphosphate kinase 


1033 


50 


101 


gi6683115 


Homo sapiens 


mRNA for KIAA0263 protein, partial 
cds. 


1027 


50 


102 


gi13623311


Homo sapiens 


clone IMAGE:3948563, mRNA, 
partial cds. 


1629 


100 


102 


gi3135968 


Homo sapiens 


Human DNA sequence from clone 
XXbac-3418 on chromosome 6p21.3- 
22.1. Contains the 5' end of the
ZNF184 gene for Kruppel-like zinc
finger protein 184, a heterogeneous
nuclear ribonucleoprotein A1
(HNRPA1) pseudogene, a CD83 
antigen pseudogene, ESTs, STSs, GSSs 
and three CpG islands, complete 
sequence. 


1627 


47 


102 


gi1769491


Homo sapiens 


Human kruppel-related zinc finger 
protein (ZNF184) mRNA, partial cds. 


1625 


47 


103 


gi16198398


Homo sapiens 


clone MGC:27353 IMAGE:4671816,
mRNA, complete cds. 


2606 


85 


103 


gi829151 


Homo sapiens 


H.sapiens ZNF37A mRNA for zinc
finger protein, 


1371 


99 


103 


gi9801232 


Homo sapiens 


Human DNA sequence from clone 
RP11-508N22 on chromosome 10.
Contains part of a novel gene 
(HSPC025), part of the ZNF37A (zinc 
finger protein 37a (KOX 21)) gene, 
part of a putative novel gene, ESTs, 
STSs, GSSs and a CpG Island, 
complete sequence. 


1371 


99 


104 


gi12053123


Homo sapiens 


mRNA; cDNA DKFZp434K1421 
(from clone DKFZp434K1421);
complete cds. 


2624 


100 


104 


gi7292866 


Drosophila 
melanogaster 


CG15747 gene product 


362 


31 


104 


gi7549210 


Babesia 
bigemina 


200 kDa antigen p200 


298 


21 


105 


gi12053123


Homo sapiens 


mRNA; cDNA DKFZp434K1421 
(from clone DKFZp434K1421);
complete cds. 


2898 


100 


105 


gi6841130 


Homo sapiens 


HSPC095 mRNA, partial cds. 


419 


100 


105 


gi7292866 


Drosophila 
melanogaster 


CG15747 gene product 


364 


30 


106 


gi10438207


Homo sapiens 


cDNA: FLJ21977 fis, clone HEP05976.


1978 


99 


106 


gi15012167


Homo sapiens 


hypothetical protein FLJ21977, clone
MGC:14918 IMAGE:3936410, mRNA,
complete cds. 


1974 


99 


106 


AAB42499 


Homo sapiens 


Human ORFX ORF2263 polypeptide 
sequence SEQ ID NO:4526. 


1392 


100 


107 


gi1228035


Homo sapiens 


Human mRNA for KIAA0191 gene, 


8020 


99 
partial cds.






107 


gi12697967


Homo sapiens 


mRNA for KIAA1711 protein, partial
cds. 


1593 


58 


107 


AAB94636 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15515. 


1004 


52 


108 


AAG81252 


Homo sapiens 


Human AFP protein sequence SEQ ID 
NO:22. 


2146 


99 


108 


gi14035812


Homo sapiens 


unnamed protein product 


2146 


99 


108 


gi10440123


Homo sapiens 


cDNA: FLJ23436 fis, clone
HRC12692. 


2054 


100 


109 


gi200009 


Mus musculus 


myosin I 


5386 


96 


109 


gi1666471


Mus musculus 


myosin I heavy chain 


5360 


94 


109 


gi56733 


Rattus 
norvegicus 


myosin I heavy chain 


5268 


91 


110 


gi12053045


Homo sapiens 


mRNA; cDNA DKFZp434K1115 
(from clone DKFZp434K1115);
complete cds. 


4840 


100 


110 


AAB65631 


Homo sapiens 


Novel protein kinase, SEQ ID NO: 158. 


4835 


99 


110 


gi14133215


Homo sapiens 


mRNA for KIAA0781 protein, partial
cds. 


4678 


100


111 


gi12642596


Homo sapiens 


nuclear receptor co-repressor/HDAC3
complex sub-unit TBLR1 (TBLR1) 
mRNA, complete cds. 


2725 


100 


111 


AAB95225 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17352. 


2720 


99 


111 


gi10434648


Homo sapiens 


cDNA FLJ12894 fis, clone
NT2RP2004170, moderately similar to 
Homo sapiens mRNA for transducin 
(beta) like 1 protein. 


2720 


99 


112 


gi2224557 


Homo sapiens 


Human mRNA for KIAA0308 gene, 
partial cds. 


6666 


99 


112 


AAY23330 


Homo sapiens 


Human tumour suppressor (kismet) 
protein. 


5759 


98 


112 


gi7243213 


Homo sapiens 


mRNA for KIAA1416 protein, partial 
cds. 


5264 


59 


113 


gi12856019


Mus musculus 


putative 


1527 


95


113 


gi3947604 


Caenorhabditis 
elegans 


cDNA EST yk129f1.3 comes from this
gene~cDNA EST yk129f1.5 comes
from this gene~cDNA EST yk203e4.3
comes from this gene~cDNA EST
yk191a9.3 comes from this
gene~cDNA EST yk262c10.3 comes
from this gene~cDNA EST yk278f9.3
comes from this gene~cDNA EST
yk325c7.3 comes from this
gene~cDNA EST yk337f1.3 comes
from this gene~cDNA EST yk449a2.3
comes from this gene~cDNA EST
yk203e4.5 comes from this
gene~cDNA EST yk191a9.5 comes
from this gene~cDNA EST yk278f9.5
comes from this gene~cDNA EST
yk262c10.5 comes from this


787 


41 
gene~cDNA EST yk325c7.5 comes
from this gene~cDNA EST yk337f1.5
comes from this gene~cDNA EST
yk448g10.5 comes from this
gene~cDNA EST yk449a2.5 comes
from this gene~cDNA EST yk636e2.3
comes from this gene~cDNA EST
yk636e2.5 comes from this
gene~cDNA EST yk550e8.3 comes
from this gene~cDNA EST yk557a9.3
comes from this gene~cDNA EST
yk579c12.3 comes from this
gene~cDNA EST yk614e7.3 comes
from this gene~cDNA EST yk653f1.3
comes from this gene~cDNA EST
yk672b2.3 comes from this
gene~cDNA EST yk550e8.5 comes
from this gene~cDNA EST yk556b1.5
comes from this gene~cDNA EST
yk557a9.5 comes from this
gene~cDNA EST yk579c12.5 comes
from this gene~cDNA EST [illegible]
comes from this gene~cDNA EST
yk614e7.5 comes from this gene






113 


gi3947603 



Caenorhabditis 
elegans 


cDNA EST yk167h7.3 comes from this
gene~cDNA EST yk167h7.5 comes
from this gene~cDNA EST yk289g5.3
comes from this gene~cDNA EST
yk332h9.3 comes from this
gene~cDNA EST yk289g5.5 comes
from this gene~cDNA EST yk332h9.5
comes from this gene~cDNA EST
yk391h4.5 comes from this
gene~cDNA EST yk653f1.5 comes
from this gene


787 


41 


114 


gi9280136 


Macaca 
fascicularis 


unnamed protein product 


3431 


95 


114 


gi4262617 


Caenorhabditis 
elegans 


contains similarity to dual specificity 
phosphatase, catalytic domain
(Pfam:PF00782, Score=16.8, E=7.4e- 
05,N=1) 


470 


35 


114 


gi5706724 


Homo sapiens 


Cdc14B3 phosphatase mRNA,
complete cds. 


166 


30 


115 


AAB95254 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 17423. 


3114 


99 


115 


gi14042385


Homo sapiens 


cDNA FLJ14693 fis, clone 
NT2RP2005360, weakly similar to 
Homo sapiens sentrin/SUMO-specific 
protease (SENP1) mRNA. 


3114 


99 


115 


gi10314023


Homo sapiens 


sentrin-specific protease (SENP2) 
mRNA, complete cds. 


3107 


99 


116 


gi4240227 


Homo sapiens 


mRNA for KIAA0869 protein, partial 
cds. 


4417 


98 
116


gi13879506


Mus musculus 


Unknown (protein for 
IMAGE:3963643) 


4063 


89 


116 


AAB93267 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 12300. 


1895 


97 


117 


gi13235092


Homo sapiens 


mRNA for testis specific protein A14 
(TSGA14 gene). 


1957 


100 


117 


gi10438839


Homo sapiens 


cDNA: FLJ22445 fis, clone
HRC09438. 


1950 


99 


117 


gi13235344


Mus musculus 


testis specific protein a 14 


1704 


87 


118 


gi7959279 


Homo sapiens


mRNA for KIAA1509 protein, partial

cds. 


o /oy 


00 


118 


AAB94101 


Homo sapiens


Human protein sequence SEQ ID
NO:14322.


lo/i 


on 


118 


gi10434073


Homo sapiens 


cDNA FLJ12531 fis, clone
NT2RM4000199.


1871 


99 


119 


AAM00936 


Homo sapiens


Human bone marrow protein SEQ ID
NO:412.


^**sn 


inn 


119 


AAB42828 


Homo sapiens 


Human ORFX ORF2592 polypeptide
sequence SEQ ID NO:5184. 




inn 


119 



gi9557949 


Homo sapiens 


mRNA for hypothetical protein
(ORF1), clone
Telethon(Italy_B41)_Strait02270_FL142.




inn 


120 


AAB11082 


Homo sapiens 


Human secreted protein ZALPHA13 
protein. 


2783 


93 


120 


gi11230043


Homo sapiens 


unnamed protein product 


2783 


93 


120 


AAB37988 


Homo sapiens 


Human secreted protein encoded by 
gene 5 clone HDPAS92. 


2747 


93 


121 


gi12852526


Mus musculus 


putative 


1689 


80 


121 


AAB41765 


Homo sapiens 


Human ORFX ORF1529 polypeptide 
sequence SEQ ID NO:3058. 


1576 


100 


121 


gi4406663 


Homo sapiens 


clone 24945 mRNA sequence, partial 
cds. 


1576 


100 


122 


AAR22958 


Homo sapiens 


Human proteasome component HC5. 


1010 


85 


122 




Homo sapiens


Human mRNA for proteasome subunit 
HC5. 


1010 


85 


122 




Homo sapiens


Human DNA sequence from clone
RP1-191N21 on chromosome 6q27.
Contains a 7 transmembrane receptor
(rhodopsin family) (olfactory receptor 
like) pseudogene, the PDCD2 gene for 
programmed cell death 2 (RP8 
homolog), the TBP gene for TATA box 
binding protein, the gene for 
proteasome subunit HC5, ESTs, STSs 
and GSSs, complete sequence. 


1010 


85 


123 


AAB21027 


Homo sapiens 


Human nucleic acid-binding protein, 
NuABP-31. 


1456 


100 


123 


AAB45146 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 27 SEQ ID NO:87. 


1456 


100 


123 


gi4884258 


Homo sapiens 


mRNA; cDNA DKFZp564O092 (from 
clone DKFZp564O092); partial cds. 


1430 


100 


124 


gi13325436


Homo sapiens 


Similar to RIKEN cDNA


1394 


100



C330013D18 gene, clone MGC:11226
IMAGE:3937599, mRNA, complete 
cds. 






124 


gi13559363


Homo sapiens 


MRPL9 mRNA for mitochondrial 
ribosomal protein L9 (L9mt), complete 
cds. 


1388 


99 


124 


AAG93251 


Homo sapiens


Human protein HP02612.


1153 


86 


125 


AAB85507 


Homo sapiens 


Human protein kinase SGK164. 


2949 


100 


125 


gi13543922


Homo sapiens


Similar to RIKEN cDNA [illegible]
gene, clone MGC:12903
cds.




inn 


125 


gi12856491


Mus musculus


putative




70 


126 


gi12653817


Homo sapiens 


Similar to Male-specific RNA 84Dd, 

clone MGC:[illegible] IMAGE:[illegible],

mRNA, complete cds. 


3399 


100 


126 


AAB94115 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 14356. 


3392 


99 


126 


gi10434102


Homo sapiens 


cDNA FLJ12549 fis, clone 
TJT9 p Mdnnn/y? o 


3392 


99 


127 


gi7243187 


Homo sapiens 


mRNA for KIAA1403 protein, partial 
cds. 


6448 


98 


127 


gi12652971


Homo sapiens 


clone MGC:858 IMAGE:3357380, 

mRNA, complete cds.


3992 


100 


127 


AAB92872 


Homo sapiens 


Human protein sequence SEQ ID 


3987 


99 


128 


AAB94324 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14807.


1779 


99 


128 


gi10434528


Homo sapiens 


cDNA FLJ12816 fis, clone 

NT2RP[illegible], weakly similar to 2-
HYDROXYMUCONIC
SEMIALDEHYDE HYDROLASE (EC
3.1.1.-).


1779 


99 


128 


AAB42143 


Homo sapiens


Human ORFX ORF1907 polypeptide
sequence SEQ ID NO:3814. 


1 591 


inn 


129 


gi6329945


Homo sapiens


mRNA for KIAA1140 protein, partial

cds. 


1RV7 
loJ / 




129 


gi12805043


Homo sapiens 


clone IMAGE:3461487, mRNA, 
partial cds. 


1279 


54 


129 


gi7302173 


Drosophila 
melanogaster 


BcDNA:LD21719 gene product


1261 


35 


130 


AAB28199 


Homo sapiens 


Human HMG-17 non histone 
chromosomal protein. 


322 


75 


130 


gi306864 


Homo sapiens 


Human non-histone chromosomal 
protein HMG-17 mRNA, complete cds. 


322 


75 


130 


gi32329 


Homo sapiens 


Human HMG-17 gene for non-histone 
chromosomal protein HMG-17. 


322 


75 


131 


gi16041794


Homo sapiens 


clone MGC:23591 IMAGE:4856946,
mRNA, complete cds. 


2714 


99 


131 


gi15559462


Homo sapiens 


Similar to old astrocyte specifically 
induced substance, clone MGC:20215 
IMAGE:4546950, mRNA, complete 
cds. 


2709 


99 






WO 02/081731 



PCT/US02/01222 



Table 2 



SEQ ID NO: 


Accession No.


Species 


Description 


Score 


% 
Identity 


131 


gi4519621


Mus musculus 


OASIS protein 


2406 


91 


132 


gi7573591 


Homo sapiens 


Human DNA sequence from clone 
RP1-309K20 on chromosome 20 
Contains the gene for a novel protein 
similar to dysferlin, the SPAG4 gene 
for sperm associated antigen 4, the 
CPNE1 gene for Copine I (similar to 
KIAA0636), the gene KIAA0765 
(HRIHFB2091) for an RNA 
recognition motif (RNP, RRM or RBD 
domain) containing protein and the 3'
end of the NEFS gene for cysteine 
desulfurase. Contains ESTs, STSs, 
GSSs and four putative CpG islands, 
complete sequence. 


4972 


100 


132 


gi15559252


Homo sapiens 


RNA binding motif protein 12, clone 
MGC:19528 IMAGE:3845090, mRNA,
complete cds. 


4972 


100 


132 


gi15215375


Homo sapiens 


RNA binding motif protein 12, clone 
MGC:16487 IMAGE:3956772, mRNA,
complete cds. 


4972 


100 


133 


gi12697774


Mus musculus 


acetyl-CoA synthetase 2


3181 


87


133 


gi12697772


Bos taurus 


acetyl-CoA synthetase 2 


3056 


83 


133 


AAB34712 


Homo sapiens 


Human secreted protein encoded by 
DNA clone vo9 1. 


2721 


100 


134 


gi7020783 


Homo sapiens 


cDNA FLJ20580 fis, clone REC00516. 


848 


100 


134 


gi15012026


Homo sapiens 


Similar to hypothetical protein 
FLJ20580, clone MGC:13430 
IMAGE:4093763, mRNA, complete 
cds. 


848 


100 


134 


gi12833008


Mus musculus 


putative 


814 


85 


135 


AAB94473 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15139. 


1970 


100 


135 


AAG74880 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5644.


1970 


100 


135 


AAB43720 


Homo sapiens 


Human cancer associated protein 
sequence SEQ ID NO:1165.


1970 


100 


136 


gi10047285


Homo sapiens 


mRNA for KIAA1605 protein, partial 
cds. 


3610 


99 


136 


gi16215453


Homo sapiens 


mRNA for bile acid beta-glucosidase. 


3610 


99 


136 


gi15030210


Homo sapiens


KIAA1605 protein, clone MGC:16895
IMAGE:4339156, mRNA, complete 
cds. 


3610 


99 


137 


gi4914601 


Homo sapiens 


mRNA; cDNA DKFZp564A026 (from
clone DKFZp564A026). 


4171 


94 


137 


AAB94357 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14881. 


2195 


99 


137 


AAY45161 


Homo sapiens 


Human secreted protein clone 
CO 139 3 protein sequence. 


2112 


100 


138 


gBI3131 


Torpedo 
marmorata


alpha-tubulin 


1192 


97 


138 


gi14198110


Mus musculus 


tubulin alpha 1 


1192 


97 


138 


gi13435777


Mus musculus 


tubulin alpha 6 


1192 


97 


139


AAB94856 


Homo sapiens 


Human protein sequence SEQ ID 
NO:16044. 


2138 


100 


139 


AAB94628 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15490. 


2138


100 


139 


gi10436294


Homo sapiens


cDNA FLJ13970 fis, clone
Y79AA1001533, moderately similar to
Mouse mRNA for RNA polymerase I
associated factor (PAF53).


2138 


100 


140 


gi2642187 


Rattus 
norvegicus 


endo-alpha-D-mannosidase 


1415 


67 


140 


AAB95204 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17303. 


1094 


66 


140 


gi10434559


Homo sapiens


cDNA FLJ12838 fis, clone
NT2RP2003230, moderately similar to 
Rattus norvegicus endo-alpha-D- 
mannosidase (Enman) mRNA. 


1094 


66 


141 


gi3449308 


Homo sapiens 


mRNA for MEGF8, partial cds. 


9785 


100 


141 


gi6681364 


Rattus 
norvegicus 


MEGF8 


4772 


95 


141 


gi10728654


Drosophila 
melanogaster 


CG7466 gene product 


2902 


34 


142 


AAY29517 


Homo sapiens 


Human lung tumour protein SAL-82 
predicted amino acid sequence. 


3048 


100 


142 


gi13958036


Homo sapiens


FYVE-finger protein EIP1 mRNA,
complete cds. 


3048 


100 


142 


AAY29861 


Homo sapiens 


Human secreted protein clone cb98 4. 


3041 


99 


143 


gi14718539


Homo sapiens 


HIC-3 mRNA, complete cds. 


3178 


99


143 


gi5689371 


Homo sapiens 


mRNA for KIAA1020 protein, partial 
cds. 


2970 


99 


143 


gi7328028 


Homo sapiens 


mRNA; cDNA DKFZp434F0616 (from 
clone DKFZp434F0616); partial cds. 


1738 


100 


144 


gi12620400


Homo sapiens 


mitochondrial carrier protein CGI-69 
long form mRNA, complete cds. 


1856 


99 


144 


AAB42783 


Homo sapiens 


Human ORFX ORF2547 polypeptide 
sequence SEQ ID NO:5094. 


1804 


96 


144 


gi10438783


Homo sapiens


cDNA: FLJ22407 fis, clone
HRC08407. 


1798 


97 


145 


gi2792366 


Homo sapiens 


unknown protein IT12 mRNA, partial 
cds. 


4390 


99 


145 


gi1843399


Homo sapiens 


mRNA, partial cds, clone:RES4-25. 


3676 


99 


145 


gi14602505


Homo sapiens 


clone IMAGE:3936655, mRNA, 
partial cds. 


2366 


99 


146 


gi13359167


Homo sapiens 


mRNA for KIAA1646 protein, partial 
cds. 


2581 


99 


146 


AAY96059 


Homo sapiens 


Human sphingosine kinase C. 


2456 


99 


146 


gi6572330 


Homo sapiens 


Human DNA sequence from clone 
59H18 on chromosome 22. Contains 
the y part of the gene for KIAA0767, a 
novel gene, ESTs, STSs, GSSs and a 
putative CpG island, complete 
sequence. 


1627 


96 


147 


gi14043303


Homo sapiens


exonuclease NEF-sp, clone
MGC:15944 IMAGE:3537866, mRNA,
complete cds.


4043


100






147 


gi13272524


Homo sapiens 


exonuclease NEF-sp mRNA, complete 
cds. 


4039 


99 


147 


gi12053043


Homo sapiens 


mRNA; cDNA DKFZp434J0315 (from 
clone DKFZp434J0315); complete cds. 


3843 


95 


148 


gi7243037 


Homo sapiens 


mRNA for KIAA1328 protein, partial 
cds. 


2894 


100 


148 


gi13874541


Macaca 
fascicularis 


hypothetical protein 


2492 


93 


148 


gi1335313


Homo sapiens 


Human muscle mRNA for embryonic 
myosin heavy chain (SMHCE). 


129 


24 


149 


AAB42399 


Homo sapiens 


Human ORFX ORF2163 polypeptide
sequence SEQ ID NO:4326. 


1362 


91 


149 


AAB42366 


Homo sapiens 


Human ORFX ORF2130 polypeptide 
sequence SEQ ID NO:4260. 


626 


100 


149 


gi7298594 


Drosophila 
melanogaster 


CG10189 gene product 


223 


35 


150 


AAB95372 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 17692. 


1538 


99 


150 


gi10435150


Homo sapiens 


cDNA FLJ13220 fis, clone 
NT2RP4002047, moderately similar to 
GTP-BINDING PROTEIN LEPA. 


1538 


99 


150 


gi10437720


Homo sapiens 


cDNA:FLJ21595 fis, clone 
COL07069. 


1438 


100 


151 


gi3327080 


Homo sapiens 


mRNA for KIAA0633 protein, partial 
cds. 


6823 


99 


151 


gi857571 


Mus musculus 


cordon-bleu gene product 


1345 


81 


151 


gi6094680 


Homo sapiens 


PAC clone RP5-1168M19 from 7p12-
q11.21, complete sequence.


1342 


100 


152 


gi15451265


Macaca 
fascicularis 


hypothetical protein 


2728 


98 


152 


AAB41597 


Homo sapiens 


Human ORFX ORF1361 polypeptide 
sequence SEQ ID NO:2722. 


2650 


100 


152 


gi5689443 


Homo sapiens 


mRNA for KIAA1053 protein, partial 
cds. 


2650 


100 


153 


gi14036062


Homo sapiens 


unnamed protein product 


1930 


100 


153 


AAG81377 


Homo sapiens 


Human AFP protein sequence SEQ ID 
NO:272. 


1925 


99 


153 


gi12833112


Mus musculus 


putative 


1727 


88 


154 


gi12832455


Mus musculus 


putative 


1220 


89


154 


gi15080314


Homo sapiens 


Similar to RIKEN cDNA 0610010D20 
gene, clone MGC:20590 
IMAGE:4310241, mRNA, complete 
cds. 


514 


100 


154 


gi6002488 


Penicillium 
chrysogenum 


hypothetical protein 


338 


31 


155 


gi14017889


Homo sapiens


mRNA for KIAA1836 protein, partial 
cds. 


2511 


100 


155 


AAB94592 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 15402. 


972 


50 


155 


gi10435321


Homo sapiens 


cDNA FLJ13337 fis, clone 
OVARC1001880. 


972 


50 


156 


gi14550510


Homo sapiens


pseudouridylate synthase 1, clone
MGC:2736 IMAGE:2822709, mRNA,
complete cds.


2123


100






156 


gi12804097


Homo sapiens


Similar to pseudouridine synthase 1,
clone MGC:11268 IMAGE:3943243,
mRNA, complete cds. 


2123 


100 


156 


gi4455035 


Homo sapiens 


pseudouridine synthase 1 (PUS1) 
mRNA, partial cds. 


1927 


99 


157 


AAY58052 


Homo sapiens 


Human protein kinase H2LAU20 
protein sequence. 


3198 


98 


157 


gi9652080 


Homo sapiens 


protein kinase DYRK4 (DYRK4) 
mRNA, partial cds. 


2844 


100 


157 


AAW71685 


Homo sapiens 


Amino acid sequence of human 
serine/threonine protein kinase. 


1909 


97 


158 


gi7300952 


Drosophila 
melanogaster 


BcDNA:LD21504 gene product 


971 


62 


158 


gi4972728 


Drosophila 
melanogaster 


unknown 


971 


62 


158 


AAB97646 


Homo sapiens 


Ribosomal S3 protein 17. 


831 


99 


159 


AAU02201 


Homo sapiens 


Phosphatase 1 protein-like protein, 
MEM6. 


1514 


100 


159 


gi15551577


Homo sapiens 


unnamed protein product 


1514 


100 


159 


AAB95633 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 18363. 


1510 


99 


160 


gi12804573


Homo sapiens


Similar to CG11334 gene product,
clone MGC:3207 IMAGE:3501899, 
mRNA, complete cds. 


1859 


100 


160 


gi12851419


Mus musculus 


putative 


1590 


86 


160 


gi7302053 


Drosophila 
melanogaster 


CG11334 gene product


1046 


59 


161 


gi1580781


Homo sapiens 


Human beige-like protein (BGL) 
mRNA, partial cds. 


9734 


99 


161 


gi10180266


Mus musculus 


LBA 


9333 


86 


161 


gi10257401


Mus musculus


LBA isoform beta 


8920 


86 


162 


gi15082589


Homo sapiens 


clone MGC:4408 IMAGE:2906200, 
mRNA, complete cds. 


2065 


99 


162 


gi15638615


Arabidopsis 
thaliana 


HEN1 


350 


37 


162 


gi13241746


Arabidopsis 
thaliana 


CORYMBOSA2 


350 


37 


163 


gi15291227


Drosophila 
melanogaster 


GH13040p 


701 


40 


163 


gi7303780 


Drosophila 
melanogaster 


CG12214 gene product 


701 


40 


163 


AAB95882 


Homo sapiens 


Human protein sequence SEQ ID 
NO:18991. 


501 


100 


164 


gi3327170 


Homo sapiens 


mRNA for KIAA0678 protein, partial 
cds. 


5255 


100 


164 


AAB95304 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17542. 


4431 


99 


164 


gi14134120


Caenorhabditis 
elegans 


endocytosis protein RME-8 


2127 


42 


165 


AAB53427 


Homo sapiens 


Human colon cancer antigen protein 
sequence SEQ ID NO:967. 


813 


96 


165


gi13905098


Mus musculus


B-cell translocation gene 1, anti- 
proliferative 


813 


96 


165 


gi293306 


Mus musculus 


B-cell translocation gene-1 protein 


813 


96 


166 


gi13365897


Macaca 
fascicularis 


hypothetical protein 


2501 


97 


166 


AAY02168 


Homo sapiens 


A facilitative glucose transporter
protein GLUT8. 


870 


99 


166 


gi13445575


Homo sapiens 


facilitative glucose transporter 
GLUT10 (SLC2A10) mRNA, complete 
cds. 


835 


39 


167 


gi13365897


Macaca 
fascicularis 


hypothetical protein 


2173 


97 


167 


AAY02168 


Homo sapiens 


A facilitative glucose transporter 
protein GLUT8.


870 


99 


167 


gi13445575


Homo sapiens 


facilitative glucose transporter 
GLUT10 (SLC2A10) mRNA, complete 
cds. 


678 


37 


168 


gi10047251


Homo sapiens 


mRNA for KIAA1588 protein, partial 
cds. 


3292 


100 


168 


gi14424704


Homo sapiens


clone MGC:15071 IMAGE:4110510,
mRNA, complete cds. 


2315 


100 


168 


gi4567179 


Homo sapiens 


chromosome 19, BAC 37295 (CIT-B- 
21A4), complete sequence. 


1269 


43 


169 


gi15558943


Homo sapiens 


guanylate binding protein 4 mRNA, 
complete cds. 


3134 


99 


169 


gi1174187


Mus musculus 


purine nucleotide binding protein 


2260 


70 


169 


gi193444


Mus musculus 


guanylate binding protein 


1986 


66 


170 


gi14585859


Homo sapiens


hypothetical protein SB138


1121 


100 


170 


gi6665778 


Mus musculus 


cyclin ania-6b 


1052 


92 


170 


gi12841169


Mus musculus


putative


1052 


92 


171 


AAB64407 


Homo sapiens 


Amino acid sequence of human 
intracellular signalling molecule 
INTRA39. 


3394 


100 


171 


AAB71963 


Homo sapiens 


Human TGF-beta receptor encoded by 
cDNA clone HFEHY04. 


3394 


100 


171 


gi10438113


Homo sapiens


cDNA: FLJ21908 fis, clone HEP03830.


3385 


99 


172 


gi12652533


Homo sapiens 


clone MGC:2637 IMAGE:3505128, 
mRNA, complete cds. 


676 


89 


172 


AAB67453 


Homo sapiens 


Amino acid sequence of a human 
chaperone polypeptide. 


668 


88 


172 


gi9758421 


Arabidopsis 


gene_id:MHF15.7~similar to unknown
protein


199 


28 


173 


AAB97025 


Homo sapiens 


Human colon carcinoma suppressor 
gene-related protein. 


1773 


61 


173 


gi9857318 


Homo sapiens 


Asef mRNA for APC-stimulated 
guanine nucleotide exchange factor, 
complete cds. 


1773 


61 


173 


gi8809845 


Homo sapiens 


chromosome 2q22 RhoGEF mRNA, 
complete cds. 


1700 


61 


174 


gi12052828


Homo sapiens 


mRNA; cDNA DKFZp564N1062 
(from clone DKFZp564N1062); 
complete cds. 


1601 


99 


174 


gi12850603


Mus musculus 


putative 


1062 


92 



174


AAB94655 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15568. 


671 


100 


175 


gi15080282


Homo sapiens


Similar to putative sialoglycoprotease
type 2, clone MGC:20293
IMAGE:4121450, mRNA, complete
cds. 


1747 


99 


175 


gi11071727


Homo sapiens 


mRNA for putative sialoglycoprotease 
type 2. 


1707 


92 


175 


gi12847276


Mus musculus


putative 


1541 


84 


176 


AAB36628 


Homo sapiens 


Human FLEXHT-50 protein sequence 
SEQ ID NO:50.


527 


100 


176 


AAB94208 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 14557. 


527 


100 


176 


AAG01512 


Homo sapiens 


Human secreted protein, SEQ ID NO:
5593.


527 


100 


177 


gi15929052


Homo sapiens


Similar to RIKEN cDNA 2810442016
gene, clone MGC:23197
IMAGE:4861869, mRNA, complete
cds. 


2084 


100 



177 


gi11493155


Homo sapiens


Human DNA sequence from clone
RP5-852M4 on chromosome 20.
Contains the gene encoding the HBV
associated factor, a novel gene similar
to Drosophila CG17883, a putative
novel gene, two CpG islands, ESTs,
GSSs, and STSs, complete sequence.


1952 


100 


177 


gi12840168


Mus musculus 


putative 


1938 


93 


178 


AAB87034 


Homo sapiens 


Human secreted protein TANGO 339, 
SEQ ID NO:3.


1449 


100 


178 


AAY76266 


Homo sapiens 


Human secreted protein encoded by 
gene 10 fragment 


1449 


100 


178 


AAB87135 


Homo sapiens 


Human secreted protein TANGO 339 
F20Y variant, SEQ ID NO: 139. 


1446 


99 


179 


gi434763 


Homo sapiens 


Human mRNA for KIAA0120 gene, 
complete cds. 


1048 


100 


179 


gi14424677


Homo sapiens


transgelin 2, clone MGC:15279
IMAGE:4301018, mRNA, complete 
cds. 


1048 


100 


179 


gi9956026 


Homo sapiens 


clone CDABP0035 mRNA sequence. 


1048 


100 


180 


AAB31677 


Homo sapiens 


Amino acid sequence of a human 
protein having a hydrophobic domain. 


2803 


100 


180 


AAE03346 


Homo sapiens 


Human gene 19 encoded secreted 
protein HCRNF14, SEQ ID NO: 120. 


2803 


100 


180 


AAE03310 


Homo sapiens 


Human gene 19 encoded secreted 
protein HCRNF14, SEQ ID NO:84. 


2803 


100 


181 


AAB41910 


Homo sapiens 


Human ORFX ORF1674 polypeptide 
sequence SEQ ID NO:3348. 


1530 


99 


181 


gi5262467 


Homo sapiens 


mRNA; cDNA DKFZp564I122 (from 
clone DKFZp564I122). 


1530 


99 


181 


gi12849716


Mus musculus 


putative 


1259 


82 


182 


gi2072972 


Homo sapiens 


Human L1 element L1.25 p40 and
putative p150 genes, complete cds.


497 


53 


182 


AAB64943 


Homo sapiens 


Human secreted protein sequence
encoded by gene 7 SEQ ID NO:121.


494


54






182 


gi5070622 


Homo sapiens 


retrotransposon L1 insertion in X-
linked retinitis pigmentosa locus, 
complete sequence. 


494 


53 


183 


AAB59191 


Homo sapiens 


Human NADE. 


217 


47 


183 


gi8452894 


Homo sapiens 


p75NTR-associated cell death executor 
(NADE) mRNA, complete cds. 


217 


47 


183 


gi189379


Homo sapiens 


Human unknown protein from clone 
pHGR74 mRNA, complete cds. 


217 


47 


184 


AAB88468 


Homo sapiens 


Human membrane or secretory protein 
clone PSEC0263. 


4931 


97 


184 


gi14272788


Homo sapiens 


unnamed protein product 


4931 


97 


184 


gi577301 


Homo sapiens 


Human mRNA for KIAA0090 gene, 
partial cds. 


4650 


99 


185 


AAG64953 


Homo sapiens 


Human ATP-dependent helicase 
protein 68. 


3169 


100 


185 


gi12052748


Homo sapiens 


mRNA; cDNA DKFZp564B1023 
(from clone DKFZp564B1023); 
complete cds. 


2716 


100 


185 


gi12836314


Mus musculus 


putative 


2655 


83 


186 


gi14017781


Homo sapiens 


mRNA for KIAA1782 protein, partial 
cds. 


2834 


99 


186 


gi4062983 


Mus musculus 


Eos protein 


2747 


95 


186 


gi11612390


Homo sapiens


zinc finger transcription factor Eos
mRNA, complete cds. 


2603 


98 


187 


AAB95721 


Homo sapiens 


Human protein sequence SEQ ID 
NO:18592. 


2419 


100 


187 


gi10436538


Homo sapiens


cDNA FLJ14153 fis, clone
NT2RM1000092, weakly similar to 
MULTIDRUG RESISTANCE 
PROTEIN 2. 


2419 


100 


187 


gi12248763


Homo sapiens 


mRNA for SMAP-4, complete cds. 


2323 


96 


188 


gi13278906


Homo sapiens 


clone MGC:4440 IMAGE:2959536, 
mRNA, complete cds. 


1040 


100 


188 


gi13278819


Homo sapiens 


clone MGC:2776 IMAGE:2959536, 
mRNA, complete cds. 


1040 


100 


188


AAB95829 


Homo sapiens 


Human protein sequence SEQ ID 
NO:18847. 


618 


79 


189 


gi14602977


Homo sapiens


Similar to KIAA0789 gene product,
clone MGC:16602 IMAGE:4110708,
mRNA, complete cds. 


3100 


99 


189 


gi3043570 


Homo sapiens 


mRNA for KIAA0523 protein, partial 
cds. 


2564 


100 


189 


gi14133217


Homo sapiens 


mRNA for KIAA0789 protein, partial 
cds. 


1463 


49 


190 


gi9717245 


Mus musculus 


cytoplasmic dynein heavy chain 


5569 


98 


190 


gi294543 


Rattus 
norvegicus 


dynein heavy chain 


5557 


98 


190 


gi402528 


Rattus 
norvegicus 


cytoplasmic dynein heavy chain 


5535 


98 


191 


gi13537204


Homo sapiens


mRNA for MAST205, complete cds.


6834 


98 


191 


gi406058 


Mus musculus 


protein kinase


6343 


86 


191 


gi3882335 


Homo sapiens 


mRNA for KIAA0807 protein, partial
cds.


6300


98






192 


gi12847109


Mus musculus 


putative 


1356 


79 


192 


gi13623271


Homo sapiens


Similar to RIKEN cDNA 2600005P05
gene, clone MGC:11321
IMAGE:3951804, mRNA, complete
cds. 


1332 


100 


192 


gi12847837


Mus musculus 


putative 


1170 


76 


193 


gi38149 


Pongo 
pygmaeus 


epsilon-globin 


397 


100 


193 


gi903731 


Gorilla gorilla 


epsilon-globin 


397 


100 


193 


gi903707 


Pan
troglodytes


epsilon-globin 


397 


100 


194 


AAB74695 


Homo sapiens 


Human membrane associated protein 
MEMAP-1. 


1799 


100 


194 


AAE01340 


Homo sapiens 


Human gene 22 encoded secreted 
protein fragment, SEQ ID NO:205. 


1799 


100 


194 


gi15929183


Homo sapiens


modulator of apoptosis 1, clone
MGC:9487 IMAGE:3922055, mRNA, 
complete cds. 


1799 


100 


195


AAG93260 


Homo sapiens 


Human protein HP 10106. 


1769 


100 


195 


gi15029765


Mus musculus


RIKEN cDNA 2810039M17 gene


1650 


91 


195 


gi12849932


Mus musculus 


putative 


1650 


91 


196 


gi14017843


Homo sapiens 


mRNA for KIAA1813 protein, partial 
cds. 


3434 


100 


196 


gi15193290


Homo sapiens 


LAPSER1 (LAPSER1) mRNA, 
complete cds. 


3309 


100 


196 


gi8217421 


Homo sapiens 


Human DNA sequence from clone 
RP11-108L7 on chromosome 10.
Contains part of the gene for a novel
Insulin-like growth factor binding type
protein with Kazal-type serine protease
inhibitor domain, the gene for a novel
protein similar to rat tricarboxylate
carrier, the gene for a novel PDZ
(DHR, GLGF) domain protein, the
gene for a novel protein similar to
KIAA0552, KIAA0341 and Fugu
hypothetical protein 2, the gene for a
novel protein similar to Plasmodium
POM1 and C. elegans F46G11.1, a
putative novel gene, the SEMA4G gene
for semaphorin 4G and a novel gene.
Contains ESTs, STSs, GSSs and seven
putative CpG islands, complete
sequence.


3264 


100 


197 


gi1458241


Caenorhabditis 
elegans 


Hypothetical protein B0507.2 


782


39 


197 


gi12832510


Mus musculus 


putative 


490 


89 


197 


AAB54014 


Homo sapiens 


Human pancreatic cancer antigen 
protein sequence SEQ ID NO:466. 


242 


100 


198


gi500747 


Mus musculus 


capping protein beta-subunit, isoform 1 


1440 


98 


198 


gi212902 


Gallus gallus


actin-capping protein Z beta subunit 


1432 


98 


198 


gi12805189


Mus musculus


capping protein (actin filament) muscle
Z-line, beta


1318


92






199 


gi14017787


Homo sapiens 


mRNA for KIAA1785 protein, partial 
cds. 


3195 


100 


199 


gi13436428


Homo sapiens


Similar to feminization 1 a homolog
(C. elegans), clone MGC:4216
IMAGE:2957950, mRNA, complete 
cds. 


2197 


64 


199 


gi12836689


Mus musculus 


putative 


2164 


65 


200 


gi7959811 


Homo sapiens 


PRO1167


389 


100 


200 


gi2736345 


Caenorhabditis 
elegans 


contains similarity to G-coupled protein 
receptors 


69 


33 


200 


gi7504953 


Caenorhabditis 
elegans 


hypothetical protein H22D07.1 - 
Caenorhabditis elegans


69 


33 


201 


gi12697975


Homo sapiens 


mRNA for KIAA1715 protein, partial 
cds. 


2230 


100 


201 


AAB42461 


Homo sapiens 


Human ORFX ORF2225 polypeptide 


1015 


100 


201 


gi12844031


Mus musculus 


putative 


567 


92 


202 




melanogaster 


\^\j£,ojy gene product 


105 

ISO 


on 


202 


gi10438900




rTYNJA* PT T994.00 fi« rlrm* 

HRC10983. 


IRA 


Q1 


202 


gi5824430


Caenorhabditis
elegans


vi-»iNr\ EfOL y&-j\j vumcs irum tms 
gene-cDNA EST yk523d4.5 comes 

from tfm apne~rDMA F5?T vlr^^^fK ^ 

comes from this gene~cDNA EST 
yk595gl2.5 comes from this 
gene-cDNA EST yk606gl0.5 comes 
from this gene— cDNA EST yk652G.5 
comes from this gene 




Li 


203 


AAM00957 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 433. 


1725 


100 


203 


gi4151807 


Rattus 
norvegicus 


membrane-associated guanylate kinase-
interacting protein 2 Maguin-2 


1484 


62 


203 


gi4151805 


Rattus 
norvegicus 


membrane-associated guanylate kinase-
interacting protein 1 Maguin-1 


1484 


62 


204 


AAM00844 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 207. 


1051 


98 


204 


gi4151807 


Rattus 
norvegicus 


membrane-associated guanylate kinase-
interacting protein 2 Maguin-2 


779 


69 


204 


gi4151805 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 1 Maguin-1 


779 


69 


205 


AAM00957 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 433. 


1576 


92 


205 


gi4151807 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 2 Maguin-2 


1349 


57 


205 


gi4151805 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 1 Maguin-1 


1349 


57 


206 


gi7242969 


Homo sapiens 


mRNA for KIAA1307 protein, partial 
cds. 


8582 


99 


206 


AAM00860 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 223. 


4841 


98 


206 


gi4426611 


Drosophila
melanogaster


pushover


2137


46








207 


AAB62210 


Homo sapiens 


Human ABCA2 transporter protein. 


9835 


99 


207 


gi13173186


Homo sapiens


ABC transporter ABCA2 (ABCA2)
mRNA, complete cds.


9835 


99 


207 


gi9957467 


Homo sapiens 


ATP-binding cassette sub-family A 
member 2 (ABCA2) mRNA, complete 
cds. 


9835 


99 


208 


AAB94358 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 14883. 


2268 


99 


208 


gi10434632


Homo sapiens


cDNA FLJ12886 fis, clone
NT2RP2004041, weakly similar to 
SYNAPSINS IA AND IB. 


2268 


99 


208 


gi12052738


Homo sapiens 


mRNA; cDNA DKFZp564H1322 
(from clone DKFZp564H1322); 
complete cds. 


2268 


99 






Homo sapiens 


Human DNA sequence from clone 
RP4-583P15 on chromosome 20 

Contains ESTs, STSs, GSSs and ten
CpG islands. Contains the TNFRSF6B
gene for tumour necrosis factor receptor
6b (decoy), the 3' part of the
KIAA1088 gene, the ARFRP1 gene for
ADP-ribosylation factor related protein
1, two genes for novel proteins, the
gene for a GLUT4 enhancer factor and
the gene for a novel zinc finger protein
similar to rat RIN ZF and the gene for a
novel BTB/POZ domain containing
zinc finger protein, complete sequence.


2074 


99 


209 


gi13162677


Homo sapiens 


GLUT4 enhancer factor mRNA, 
complete cds. 


2055 


98 


209 


gi12655101


Homo sapiens


clone IMAGE:3140406, mRNA,
partial cds. 


1766 


100 


210 


gi14279329


Homo sapiens 


ubiquitin specific protease (USP28) 
mRNA, complete cds. 


4131 


92 


210 


gi7959297


Homo sapiens 


mRNA for KIAA1515 protein, partial 
cds. 


3872 


100 


210 


AAB31552 


Homo sapiens 


A human ubiquitin specific protease 25 
(USP25). 


2058 


48 


211 


AAB36579 


Homo sapiens 


Human FLEXHT-1 protein sequence 
SEQ ID NO:1.


1829 


100


211 


AAB94048 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14211. 


1825 


99 


211 


gi10433984


Homo sapiens 


cDNA FLJ12475 fis, clone
NT2RM1000962. 


1825 


99 


212 


gi15824499


Homo sapiens 


GalNAc-4-O-sulfotransferase 1 
mRNA, complete cds. 


2238 


100 


212 


gi11990885


Homo sapiens


GalNAc4ST mRNA for GalNAc 4-
sulfotransferase, complete cds.


2238 


100 


212 


gi15559803


Homo sapiens


carbohydrate (N-acetylgalactosamine
4-O) sulfotransferase 8, clone
MGC:20987 IMAGE:4635405, mRNA, 
complete cds. 


2238 


100 


213


AAB43387 


Homo sapiens 


Human ORFX ORF3151 polypeptide
sequence SEQ ID NO:6302. 


1056 


100 


213 


gi15292317


Drosophila 
melanogaster 


LD46863p 


549 


50 


213 


gi7302029 


Drosophila 
melanogaster 


CG12054 gene product 


549 


50 


214 


gi12843216


Mus musculus


putative 


913 


84 


214 


gi14585867


Homo sapiens


hypothetical protein SB145


297 


44 


214 


gi14388186


Macaca
fascicularis


hypothetical protein


295


44


215 


gi14111710


Homo sapiens


mRNA for KIAA0833 protein, partial
cds.


7195


99






Homo sapiens 


Human DNA sequence from clone 
RP3-467L1 on chromosome 1p36.21-
Contains the 5' part of gene
KIAA0833, the VAMP3 gene for
vesicle-associated membrane protein 3
(cellubrevin), the PER3 gene for period
(Drosophila) homolog 3 and the gene
for urotensin II. Contains two putative
CpG islands, ESTs, STSs and GSSs,
complete sequence.


3642 


99 


215 


AAB42729 


Homo sapiens 


Human ORFX ORF2491 polypeptide
sequence SEQ ID NO:4986.


Q07 
yy i 




216 


gi7293088 


Drosophila 
melanogaster 


CG9213 gene product


811 

Oil 


in 


216 


gi15810333


Arabidopsis 
thaliana 


unknown protein 


713 


*o 


216 


gi13324888


Caenorhabditis 
elegans 


Hypothetical protein B0361.2


710 




217 


gi2443331 


Xenopus 
laevis 


Nfrl 


2421 


75 


217 


AAB34944 


Homo sapiens 


Human secreted protein sequence
encoded by gene 20 SEQ ID NO: 148. 


1 147 


01 


217 


gi15292543


Drosophila 
melanogaster 


SD06560p


01 1 
71 1 


JO 


218 


gi7243111 


Homo sapiens 


mRNA for KIAA1165 protein, partial
cds.




inn 


218 


gi1657758


Rattus 
norvegicus 


densin-180 




01 ! 


218 


gi8570180 


Rattus 
norvegicus 


densin-180 variant D


1250 


83 


219 


gi14017839


Homo sapiens


mRNA for KIAA1811 protein, partial
cds.


1726 


80 


219 


gi3217028 


Homo sapiens 


mRNA for putative serine/threonine
protein kinase, partial. 


1450 


84 


219 


gi7294217 


Drosophila 
melanogaster 


CG6114 gene product


1055 


70 


220 


gi7297674 


Drosophila 
melanogaster 


CG13139 gene product


942 


75 


220 


gi12857050


Mus musculus


putative 


767 


62 


220 


gi15636900


Gallus gallus 


avEna neural variant 


139 


52 


221 


gi15489242


Homo sapiens


clone IMAGE:3859726, mRNA,
partial cds.


1001


88






221 


gi13543991


Homo sapiens 


clone IMAGE:3627860, mRNA, 
partial cds. 


1001 


88 


221 


gi12847182


Mus musculus 


putative 


328 


39 


222 


gi14133209


Homo sapiens 


mRNA for KIAA0654 protein, partial 
cds. 


6089 


99 


222 


gi930343 


Homo sapiens 


Human LAR-interacting protein 1b
mRNA, complete cds. 


3559 


60 


222 


gi930341




Human LAR-interacting protein 1a
mRNA, complete cds.




ou 


223 


gi12620207


Homo sapiens


C1orf25 mRNA, complete cds.


3807 


98 


223 




Homo sapiens


Human DNA sequence from clone
GS1-120K12 on chromosome 1q25.3-
Contains the gene for ring finger
protein DING or BAP-1, an FTH1
(ferritin, heavy polypeptide 1)
pseudogene, the 3' end of the gene for a
novel protein similar to archaeal, yeast
and worm N2,N2-dimethylguanosine
tRNA methyltransferase, ESTs, STSs,
GSSs and two putative CpG islands,
complete sequence.




AO 


223 


gi12835704


Mus musculus 


putative 


1420 


oo 


224 


gi14595658


Xenopus
laevis


LIM protein prickle


2865 


67 


224 


gi10727796


Drosophila 
melanogaster 


esn gene product 


698 


42 


224 


gi6634092 


Drosophila 
melanogaster 


LIM-domain protein


698 




225 


gi13375149


Homo sapiens


Human DNA sequence from clone
RP5-1118M15 on chromosome 20
Contains part of a gene similar to P14 
Bos taurus (P14L), a novel gene, ESTs, 
STSs, GSSs and a CpG Island, 
complete sequence. 


957 


99 


225 


gi7259265 


Mus musculus 


contains transmembrane (TM) region 


314 


50 


225 


AAY53871 


Homo sapiens 


A human brain-derived signalling 
factor polypeptide. 


299 


45 


226 


gi12803987


Homo sapiens


clone MGC:4174 IMAGE:3634226,
mRNA, complete cds. 


743 


100 


226 


gi12805417


Mus musculus 


Unknown (protein for MGC:7354) 


444 


66 


226 


gi12849498


Mus musculus 


putative 


235 


72 


227 


AAY91629 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 23 SEQ ID NO:302. 


1391 


87 


227 


gi7677403


Homo sapiens 


F-box protein FBG2 (FBG2) mRNA, 
complete cds. 


1391 


87 


227 


AAY83046 


Homo sapiens 


F-box protein FBP-6. 


1333 


82 


228 


gi15079958


Homo sapiens


chromosome 11 open reading frame
24, clone MGC:19741 
IMAGE:3614861, mRNA, complete 
cds. 


2231 


99 


228 


gi11527205


Homo sapiens


DM4E3 (C11orf24) mRNA, complete
cds. 


2224 


99 


228


AAB18965


Homo sapiens 


Amino acid sequence of a human 
transmembrane protein. 


2055 


99 


229 


gi15930199


Homo sapiens 


Similar to RIKEN cDNA 4921523118 
gene, clone MGC:9467 
IMAGE:3914747, mRNA, complete 
cds. 


1451 


99 


229 


gi13278594


Mus musculus 


RIKEN cDNA 4921523118 gene 


1440 


97 


229 


gi12856904


Mus musculus 


putative 


1440 


97 


230 


gi15680131


Homo sapiens


hypothetical protein FLJ12171, clone
MGC:19889 IMAGE:4652087, mRNA,
complete cds. 


1638 


100 


230 


gi14043242


Homo sapiens


hypothetical protein FLJ12171, clone
MGC:15694 IMAGE:3351601, mRNA, 
complete cds. 


1638 


100


230 


AAB93912 


Homo sapiens 


Human protein sequence SEQ ID
NO: 13880. 


1634 


99 


231 


AAB56947 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO: 1525. 


779 


100 


231 


AAB68408 


Homo sapiens 


Amino acid sequence of a human 
NOV1 polypeptide. 


574 


100 


231 


AAY81695 


Homo sapiens 


Human PIN protein sequence. 


574 


100 


232 


gi11138034


Homo sapiens


mRNA for KIAA1173 protein,
complete cds. 


2665 


100 


232 


AAG89259 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
379. 


2654 


99 


232 


gil2834372 


Mus musculus 


putative 


2427 


90 


233 


AAB98612 


Homo sapiens 


Human tumour suppressor gene, 
TSG16, protein. 


1706 


55 


233 


gil 1596412 


Homo sapiens 


GAC-1 (GAC-1) mRNA, complete cds. 


893 


77 


233 


gi4240237 


Homo sapiens 


mRNA for KIAA0874 protein, partial 
cds. 


893 


77 


234 


AAB41108 


Homo sapiens 


Human ORFX ORF872 polypeptide 
sequence SEQ ID NO: 1744. 


4170 


99 


234 


gi6331287 


Homo sapiens 


mRNA for KIAA1274 protein, partial 
cds. 


3936 


99 


234 


gil545959 


Mus musculus 


paladin 


3560 


80 


235 


gi9368849 


Homo sapiens 


mRNA; cDNA DKFZp761G2113 
(from clone DKFZp761G2113).


972 


99 


235 


gi7293878 


Drosophila 
melanogaster 


CG13379 gene product 


274 


36 


235 


gil4532482 


Arabidopsis 
thaliana


AT5g58570Anznl_20 


152 


31 


236 


gi3242242 


Mus musculus 


hyperpolarization-activated cation 
channel, HAC2 


4309 


91 


236 


gi7407645 


Rattus 
norvegicus 


hyperpolarization-activated, cyclic 
nucleotide-gated potassium channel 1 


4306 


91 


236 


gi2708316 


Mus musculus 


brain cyclic nucleotide gated 1; Bcng-1;
brain specific ion channel protein


4301 


91 


237 


AAB13370 


Homo sapiens 


Human brain-associated protein 
HBAP-1. 


1055 


100 


237 


gi9944291 


Homo sapiens 


TTYH1 mRNA, complete cds. 


1055 


100 


237 


gi9651109 


Macaca 
fascicularis 


TTYH1 


1032 


98 
238


AAU00476 


Homo sapiens 


Human INTERCEPT 400 protein. 


1428 


100 


238 


AAY79266 


Homo sapiens 


Human elongase homologue HS3. 


1428 


100 


238 


AAB29648 


Homo sapiens 


Human membrane-associated protein 
HUMAP-5. 


1428 


100 


239 


AAB84885 


Homo sapiens 


Human protein, SEQ ID 14. 


4029 


99 


239 


AAB84882 


Homo sapiens 


Human protein, SEQ ID 6. 


4029 


99 


239 


gi5262593 


Homo sapiens 


mRNA; cDNA DKFZp434N093 (from
clone DKFZp434N093); partial cds. 


3684 


99 


240 


gil3477247 


Homo sapiens 


Similar to RIKEN cDNA
5031400M07 gene, clone MGC: 13079 
IMAGE:3840918, mRNA, complete 
cds. 


2153


100 


240 


AAB18987 


Homo sapiens 


Amino acid sequence of a human 
transmembrane protein. 


2148 


99 


240 


gi7670425 


Mus musculus


unnamed protein product 


1904 


89 


241 


AAG63222 


Homo sapiens 


Amino acid sequence of a human lipid
metabolism enzyme. 


2194 


100 


241 


gil4861069 


Mus musculus


phosphatidylinositol phosphate kinase
type II gamma


2120 


OS 


241 


gi3387798 


Rattus 
norvegicus 


phosphatidylinositol 5-phosphate
4-kinase gamma


2087 




242 


gi7295732 


Drosophila 
melanogaster 


ft gene product 


2915 


39 


242


gil57409 


Drosophila 
melanogaster 


fat protein 


2901 


39 


242 


gil0727403 


Drosophila 
melanogaster 


ds gene product 


2236 


34 


243 


AAF90315 aa 
2 


Homo sapiens 


Winged helix/zinc finger transcription 
factor FOXP1 cDNA.


819 


98 


243 


AAB82339 


Homo sapiens 


Winged helix/zinc finger transcription 
factor FOXP1. 


819 


98 


243 


gil2043714 


Homo sapiens 


clone pAB195 FOXP1 (FOXP1) 
mRNA, complete cds. 


819 


98 


244 


gil0440073 


Homo sapiens 


cDNA: FLJ23399 fis, clone HEP18254.




Iflrt 

lw 


244 


gi7018524 


Homo sapiens 


mRNA; cDNA DKFZp762K137 (from 
clone DKFZp762K137); partial cds.


2524 


100 


244 


gil4133227 


Homo sapiens 


mRNA for KIAA0970 protein, partial 
cds. 


1367 


51 


245 


AAB94855 


Homo sapiens 


Human protein sequence SEQ ID 
NO:16042. 


1347 


100 


245 


gil0436290 


Homo sapiens 


cDNA FLJ13968 fis, clone
Y79AA1001493, weakly similar to
UBIQUITIN-CONJUGATING
ENZYME E2-17 KD 9 (EC 6.3.2.19).


1347 


100 


245 


gil6198439 


Homo sapiens 


hypothetical protein FLJ13855, clone
MGC:16842 IMAGE:3915698, mRNA, 
complete cds. 


1347 


100 


246 


gi6330302 


Homo sapiens 


mRNA for KIAA1185 protein, partial
cds. 


2041 


100 


246 


AAG74603 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5367.


1530 


97 


246 


AAB53321 


Homo sapiens 


Human colon cancer antigen protein 
sequence SEQ ID NO:861.


1530 


97 
247


gi535390 


Macronuclear 
Homo sapiens 


Human cellular retinol binding protein 
II (CRBPII) mRNA, complete cds. 


715 


99 


247 


gi397352 


Mus musculus 


mCRBPII 


674 


91 


247


gil2833902 


Mus musculus 


putative 


669 


90 


248 


AAG01285 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
5366. 


209 


87 


248 


AAR05562 


Homo sapiens 


Laminin -binding protein encoded by 
insert from J9 lambda gt10 phage.


209 


87 


248 


gil 149509 


Gallus gallus 


37kD laminin receptor precursor/p40
ribosomal associated protein 


209 


87 


249 


gil3162226 


Homo sapiens 


Human DNA sequence from clone
RP4-543J19 on chromosome 20
Contains part of the GNAS1 gene
encoding guanine nucleotide binding
protein (G protein, alpha stimulating
activity polypeptide 1) including
neuroendocrine secretory protein 55
(NESP55), the CTSZA gene encoding
cathepsin Z, the ATP5E gene encoding
ATP synthase (H+ transporting,
mitochondrial F1 complex, epsilon
subunit), the gene encoding protein
HSPC130 (THI Drosophila homolog),
the gene for tubulin beta 1 class VI
(TUBB1), a gene encoding the
CGI-107 protein (LOC51012), four CpG
islands, ESTs, STSs and GSSs,
complete sequence.


1591 


100 


249 


gil 1230445 


Homo sapiens 


TUBB1 gene for human beta tubulin 1, 
class VI.


1591 


100 


249 


gi212834 


Gallus gallus 


beta-tubulin 


1340 


85 


250 


gil3162226


Homo sapiens 


Human DNA sequence from clone
RP4-543J19 on chromosome 20
Contains part of the GNAS1 gene
encoding guanine nucleotide binding
protein (G protein, alpha stimulating
activity polypeptide 1) including
neuroendocrine secretory protein 55
(NESP55), the CTSZA gene encoding
cathepsin Z, the ATP5E gene encoding
ATP synthase (H+ transporting,
mitochondrial F1 complex, epsilon
subunit), the gene encoding protein
HSPC130 (THI Drosophila homolog),
the gene for tubulin beta 1 class VI
(TUBB1), a gene encoding the
CGI-107 protein (LOC51012), four CpG
islands, ESTs, STSs and GSSs,
complete sequence.


1986 


100 


250 


gil 1230445 


Homo sapiens 


TUBB1 gene for human beta tubulin 1, 
class VI.


1986 


100 


250 


gi212834 


Gallus gallus 


beta-tubulin 


1699 


85 


251 


gi559325 


Homo sapiens 


Human mRNA for ATP synthase alpha
subunit, complete cds.


1566


99






251 


gi559317 


Homo sapiens 


Human gene for ATP synthase alpha 
subunit, complete cds (exon 1 to 12). 


1566 


99 


251 


gi34468 


Homo sapiens 


H.sapiens mRNA for mitochondrial 
ATP synthase. 


1566 


99 


252 


gi559325 


Homo sapiens 


Human mRNA for ATP synthase alpha 
subunit, complete cds. 


2192 


84 


252 


gi559317 


Homo sapiens 


Human gene for ATP synthase alpha
subunit, complete cds (exon 1 to 12). 


2192 


84 


252 


gi34468 


Homo sapiens 


H.sapiens mRNA for mitochondrial
ATP synthase. 


2192 


84 


253 


gil4550508 


Homo sapiens 


Similar to CG8974 gene product, clone
MGC:2460 IMAGE:2964524, mRNA,
complete cds. 


1051 




253 


gil5928691 


Mus musculus 


Unknown (protein for MGC:19394)


1036 


98 


253 


gi7293133


Drosophila 
melanogaster 


CG8974 gene product 


608 


66 


254 


AAE04880 


Homo sapiens 


Human protease protein-7.




100 


254 


gil4043577 


Homo sapiens 


hypothetical protein FLJ12455, clone
MGC:13149 IMAGE:4298740, mRNA, 
complete cds. 


2795 


100 


254 


AAB94023 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14157. 


2781 


99 


255 


gi2501855 


Homo sapiens 


22 kDa actin-binding protein (SM22) 
gene, complete cds. 


937 


95 


255 


gi2340833 


Homo sapiens 


DNA for SM22 alpha, complete cds. 


937 


95 


255 


gi2335047 


Homo sapiens 


mRNA for SM22 alpha, complete cds. 


937 


95 


256 




Homo sapiens


similar to prokaryotic-type class I 
peptide chain release factors, clone 
MGC:20261 IMAGE:3029407, mRNA, 
complete cds. 


1948 


99 


256 


gi6706658


Homo sapiens


Human DNA sequence from clone
RP1-101K10 on chromosome 6q25-26. 
Contains a novel gene, the gene for a 
novel protein similar to Prokaryotic- 
type class I peptide chain release 
factors, the 3' end of gene RGS17
(RGSZ2) for regulator of G-protein 
signaling 17, ESTs, STSs, GSSs and 
two putative CpG islands, complete 
sequence. 


1940


99 


256 


gil5680165 


Homo sapiens 


similar to prokaryotic-type class I 
peptide chain release factors, clone 
MGC:20252 IMAGE:4646472, mRNA, 
complete cds. 


1375 


98 


257 


gil5080204 


Homo sapiens 


similar to prokaryotic-type class I 
peptide chain release factors, clone 
MGC:20261 IMAGE:3029407, mRNA, 
complete cds. 


1706 


90 


257 


gi6706658 


Homo sapiens 


Human DNA sequence from clone
RP1-101K10 on chromosome 6q25-26.
Contains a novel gene, the gene for a
novel protein similar to
Prokaryotic-type class I peptide chain
release factors, the 3' end of gene RGS17
(RGSZ2) for regulator of G-protein
signaling 17, ESTs, STSs, GSSs and
two putative CpG islands, complete
sequence.


1698


89






257 


gil5680165 


Homo sapiens 


similar to prokaryotic-type class I 
peptide chain release factors, clone 
MGC:20252 IMAGE:4646472, mRNA, 

complete cds.


1133 


85 


258 


gi7295482


Drosophila
melanogaster 




OlO 


A1 

m 


258 


gil2322327 


Arabidopsis 
thaliana 


unknown protein 


451 


46 


258 


gi9454545


Arabidopsis
thaliana


unknown protein


431 


AC 


259 


AAR95307 



Homo sapiens


Human protein sequence SEQ ID
NO: 17548. 


Mill 


100


259 


gil4042477


Homo sapiens


cuina ri_j i4/4U ris, clone 

NT9RP^nn?fiO? wpnlflv similar f/\ 

PROBABLE PROTEIN DISULFIDE 
ISOMERASE ER-60 PRECURSOR 
(EC 5.3.4.1). 


3011


100


259 


gil5862252 


Homo sapiens 


IIHIIhIIIvVI UIVICIU 17lV/UU4yl> 


JUuO 


OQ 


260 


gil5079416 


Homo sapiens 


secreted modular calcium-binding 

nrotem 1 clone A/ffSf^l QR 0^ 

IMAGE:4549051, mRNA, complete
cds. 


2359 


100 


260 


AAB19394 


Homo sapiens 


Amino acid sequence of a human 
secreted protein. 


2355 


99 


260 


gil0432431 


Homo sapiens 


mRNA for secreted modular calcium- 
binding protein (smocl gene). 


2343 


99 


261 


gi7020475 


Homo sapiens 


cDNA FLJ20400 fis, clone KAT00587.


1687 


100 


261 


gill 18097 


Caenorhabditis 
elegans 


proline and glycine-rich 


268 


33 


261 


AAW49723 


Homo sapiens 


Protein polymer adhesive substrate 
PPAS1-F. 


261 


32 


262 


gil6197949 


Drosophila 
melanogaster


LD21896p 


325 


29 


262 


gi7293303 


Drosophila 
melanogaster 


CG9089 gene product 


325 


29 


262 


gi3170539


Takifugu
rubripes 


unknown 


291 


40 


263 


AAB42525 


Homo sapiens 


Human ORFX ORF2289 polypeptide 
sequence SEQ ID NO:4578. 


3570 


80 


263 


gi2887497 


Homo sapiens 


chromosome 19, overlapping cosmids 
R28707 and R34001, complete 
sequence. 


3570 


80 


263 


AAB42538 


Homo sapiens 


Human ORFX ORF2302 polypeptide 
sequence SEQ ID NO:4604. 


2835 


99 


264 


gil4017849 


Homo sapiens 


mRNA for KIAA1816 protein, partial 
cds. 


1637 


99 


264 


gi8655687 


Homo sapiens 


mRNA; cDNA DKFZp762E1511
(from clone DKFZp762E1511).


892


100






264 


gi6979930 


Homo sapiens 


Maml mRNA, partial cds. 


315 


30 


265 


gil2836420 


Mus musculus 


putative 


2511 


93 


265 


gi!0437002 


Homo sapiens 


cDNA: FLJ21013 fis, clone
CAE05223. 


1859 


99 


265 


AAB58385 


Homo sapiens 


Lung cancer associated polypeptide 
sequence SEQ ID 723. 


1704 


99 


266 


gil4198321 


Mus musculus 


ribosomal protein L31


543 


92 


266 


gi57115 


Rattus 
norvegicus 


ribosomal protein L31 (AA 1-125) 


543 


92 


266 


gil4586963 


Mus musculus 


M75 


543 


92 


267 


gil78424 


Homo sapiens 


Human apolipoprotein A-II mRNA, 
complete cds. 


478 


96 


267 


gi296634 


Homo sapiens 


Human gene for apolipoprotein AII.


478 


96 


267 


gi296633 


Homo sapiens 


Human FVNJ A fhr a nnl fnnnrr\+»»"ir» A TT 


**/o 




268 


AAB47184 


Homo sapiens






100


268 


gi7321168 


Homo sapiens 


Human DNA sequence from clone 
RP5-860F19 on chrnmommp ?On1? 7 
13 Contains the gene for KIAA1442
(similar to olfactory neuronal 
transcription factors (COE1, COE2, 
COE3, EBF3, OLF1)), RPL19 (60S 
ribosomal protein L19) and HSPC080
pseudogenes, the gene for 
metallocarboxypeptidase (CPX-1) and 
a novel gene. Contains ESTs, STSs, 
GSSs and four CpG islands, complete 
sequence. 


3571 


100 


268 


AAB36174 


Homo sapiens 


Human APG04 protein. 


3567 


99 


269 


gi2314829 


Homo sapiens 


jerky gene product homolog mRNA, 
complete cds. 


1430 


59 


269 


gil0140857 


Mus musculus 


jerky 1 


752 


33 


269 


AAG62624 


Homo sapiens 


Human cell nucleus regulatory protein 
56. 


598 


34 


270 


gi7959227 


Homo sapiens 


mRNA for KIAA1483 protein, partial 
cds. 


2231 


99 


270 


gi34192 


Homo sapiens 


Human KUP mRNA for protein with 
two zinc fingers. 


627 


39 


270 


gil3310782


Mus musculus 


myoneurin


315 


24 


271


AAB93814 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 13604. 


1408 


97 


271 


gil0433080 


Homo sapiens 


cDNA FLJ11753 fis, clone
HEMBA1005583.


1408 


97 


271 


AAB41771 


Homo sapiens 


Human ORFX ORF1535 polypeptide 
sequence SEQ ID NO:3070. 


821 


99 


272 


gi7959197 


Homo sapiens 


mRNA for KIAA1468 protein, partial 
cds. 


4603 


100 


272 


gil5080502 


Homo sapiens 


clone MGC:16944 IMAGE:4339646, 
mRNA, complete cds. 


4317 


94 


272 


gi9755831 


Arabidopsis
thaliana 


putative protein 


675 


27 


273 


gil5080502


Homo sapiens 


clone MGC:16944 IMAGE:4339646,
mRNA, complete cds. 


4362 


98 
273


gi7959197 


Homo sapiens 


mRNA for KIAA1468 protein, partial
cds. 


4360 


96 


273 


gi9755831 


Arabidopsis
thaliana


putative protein 


704 


28 


274 


AAB92483 


Homo sapiens 


Human protein sequence SEQ ID 
NO:10570.


2626 


100 


274 


gi7021875 


Homo sapiens 


cDNA FLJ10051 fis, clone
HEMBA1001281. 


2626 


100 


274 


gil2837616 


Mus musculus


putative 


2065 


90 


275 


gil0716076


Homo sapiens 


mRNA for testis-abundant finger
protein, complete cds. 


2739 


100 


275 


gil4043332 


Homo sapiens 


Similar to ring finger protein 23, clone
MGC:2475 IMAGE:3051389, mRNA,
complete cds. 


2533 


94 


275 


gil0716078 


Mus musculus


testis-abundant finger protein


2407 


✓A 


276 


AAB44673 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 33 SEQ ID NO:138.


1014 




276 


gil747 


Oryctolagus 
cuniculus 


trichohyalin 


213 


22 


276 


gil3936996 


Human 
herpesvirus 8 


ORF73 


203 


22 


277 


AAG74326 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5090.


1101 


100 


277 


AAB56461 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO:1039.


778 


100


277 


gil2842930 


Mus musculus


putative 


688 


90 


278 




Homo sapiens


Human DNA binding protein (HPF2)
mRNA, complete cds. 


1528 


47 


278 




Homo sapiens


Human DNA sequence from clone 
RP1-54B20 on chromosome Xpl LI- 
it. j. contains tne d ena ox a novel 
SSX family protein gene, two novel 

KRAB box containing C2H2 type zinc
finger protein genes, a KRAB box
protein pseudogene, the gene for a
novel protein similar to lysozyme C
(1,4-beta-N-acetylmuramidase), the

ZNF81 gene for zinc finger protein 81 
(HFZ20), ESTs, STSs, GSSs and three 
CpG islands, complete sequence. 


1497 


55 


278 


gi498152 


Homo sapiens 


Human mRNA for KIAA0065 gene, 
partial cds. 


1495 


46 


279 


gi2914676 


Homo sapiens 


chromosome 16, cosmid clone 360H6 
(LANL), complete sequence. 


882 


35 


279 


gil4250678 


Homo sapiens 


clone MGC:10489 IMAGE:3945548,
mRNA, complete cds. 


882 


35 


279 


gi2342506 


Homo sapiens 


mRNA for zinc finger protein FPM315,
complete cds. 


875 


35 


280 


gi434779 


Homo sapiens 


Human mRNA for KIAA0112 gene,
partial cds. 


2072 


100 


280 


gi!5278392 


Homo sapiens 


homolog of yeast ribosome biogenesis
regulatory protein RRS1, clone
MGC:4831 IMAGE:3603972, mRNA,
complete cds.


1905


100






280 


gil2804751 


Homo sapiens 


Similar to regulator for ribosome 
resistance homolog (S. cerevisiae), 
clone MGC:2755 IMAGE:2824034, 
mRNA, complete cds. 


1905 


100 


281 


AAB95761 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 18686. 


789 


100 


281 


AAG81272 


Homo sapiens 


Human AFP protein sequence SEQ ID 
NO:62. 


789 


100 


281 


gil4035852 


Homo sapiens 


unnamed protein product 


789 


100 


282 


gil5080911 


Homo sapiens 


neo-poly(A) polymerase mRNA,
complete cds.


^707 


oo 
yy 


282 


gil5384858 


Homo sapiens 


mRNA for poly(A) polymerase gamma 


3797 


99 


282 


gil3641252 


Homo sapiens 


SRP RNA 3' adenylating enzyme/pap2
mRNA, complete cds. 


3779 


99 


283 


gi6807698 


Homo sapiens


mRNA; cDNA DKFZp434A1014
(from clone DKFZp434A1014); partial
cds. 


1/1*37 


Of 


283 


gil2853788 


Mus musculus


putative 


408 


38 


283 


gi4468790 


Xenopus
laevis 


cnpp/lv T\Tr\tf*in 


1 SA 
10*# 




284 


gi3327062 


Homo sapiens 


mRNA for KIAA0624 protein, partial
cds.


lvl ly 


00 


284 


gil3702612 


Staphylococcus
aureus subsp.
aureus N315


ORFID:SA2447~hypothetical protein, 
similar to streptococcal hemagglutinin 
protein 


223 


19 


284 


gil4248429 


Staphylococcus
aureus subsp.
aureus Mu50


hypothetical protein 


223 


19 


285 


gil2697941 


Homo sapiens 


mRNA for KIAA1698 protein, partial 

cds.


4716 


100 


285 


gi7299794 


Drosophila 
melanogaster 


CG9591 gene product 


290 


31 


285 


AAR99256 


Homo sapiens


Natural killer lytic associated protein. 


92 


40 


286 


AAG62395 


Homo sapiens 


Human zinc finger protein 46. 


2375 


100 


286 


gi7576274


Homo sapiens


Human DNA sequence from clone 
RP11-393J16 on chromosome 10.
Contains part of the ZNF33A gene for
zinc finger protein 33a (KOX 31), a 
novel gene for a novel KRAB box 
containing zinc finger gene, a zinc 
finger pseudogene, ESTs, STSs, GSSs 
and two putative CpG islands, complete 
sequence. 


2015 


100 


286 


gi881564 


Homo sapiens 


Human zinc finger containing protein 
ZNF157 (ZNF157) mRNA, complete 
cds. 


1339 


51 


287 


gi2822143 


Homo sapiens 


chromosome 19, cosmid R30217, 
complete sequence. 


1838 


53 


287 


gi9968290 


Homo sapiens 


mRNA for zinc finger protein (ZNF304
gene). 


1735 


50 


287 


gil3543419 


Homo sapiens 


Similar to zinc finger protein 304,
clone MGC:4079 IMAGE:3530863,
mRNA, complete cds.


1735


51






288 


gi540469 


Homo sapiens 


(clone HGT26) T cell receptor gamma- 
chain mRNA, V region. 


399 


91 


288 


gi3047024 


Homo sapiens 


T-cell receptor gamma V1 gene region.


384 


100 


288 


gi339167 


Homo sapiens 


Human T-cell receptor rearranged 
gamma-chain gene V-region (V4)
(subgroup I). 


384 


100 


289 


AAY69976 


Homo sapiens 


DHFR-HM protein. 


886 


93 


289 


gi!82724 


Homo sapiens 


Human dihydrofolate reductase gene


886 


93 


289 


gil82717 


Homo sapiens 


Human dihydrofolate reductase gene
exon 6 and 3' flank. 


886 


93 


290 


AAE01782 


Homo sapiens 


Human gene 13 encoded secreted 
protein HDPNW93, SEQ ID NO:103.


4269 


99 


290 


gil0437433 


Homo sapiens 


cDNA: FLJ21347 fis, clone
COL02724. 


4127 


97 


290 


AAB74693 


Homo sapiens 


Human protease and protease inhibitor 
PPIM-26. 


3948 


99 


291 


gi6681662


Mus musculus


ENH3 




on 


291 


gil2844277


Mus musculus


putative 


800 


79 


291 


AAY12510 


Homo sapiens


Human 5' EST secreted protein SEQ ID
NO:541.


OHO 


yy 


292 


AAB47327 


Homo sapiens


FCTR4. 


010% 1 
a /yo 


OR 

yo 


292 


gil5141735 


Homo sapiens 


unnamed protein product 


2798 


98 


292 


gi9663126


Homo sapiens


mR M A ■fr\r rliTranncnntP 1 7 /vnnn 
UUVl^A XUi viUUIUUoUIIIC J.^ UL/CH 

reading frame ^ fP12nrfV\ 


lid 




293 


gil0440367 


Homo sapiens 


mRNA for FLJ00018 protein, partial
cds. 


5938 


100 


293 


gil5488570 


Homo sapiens 


Similar to hypothetical protein 
FLJ00018, clone MGC:10073
IMAGE:3896004, mRNA, complete 
cds. 


4736 


99 


293 


gil0438857 


Homo sapiens 


cDNA: FLJ22458 fis, clone
HRC10001. 


1570 


99 


294 


AAB08948 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 21 SEQ ID NO:105.


1601 


99 


294 


AAB08911 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 21 SEQ ID NO:68.


1601 


99 


294 


AAB80238 


Homo sapiens 


Human PRO238 protein.


641 


44 


295 


AAB18457 


Homo sapiens 


A human TANGO 216 polypeptide 
clone. 


2106 


98 


295 


AAB18447 


Homo sapiens 


Amino acid sequence of human 
TANGO 216 polypeptide.


2106 


98 


295 


gil4017381 


Homo sapiens 


tumor endothelial marker 8 precursor 
(TEM8) mRNA, complete cds. 


1231 


57 


296 


gil4388342 


Macaca 
fascicularis 


hypothetical protein 


3833 


92 


296 


gi7243195 


Homo sapiens 


mRNA for KIAA1407 protein, partial 
cds. 


3817 


100 


296 


gil5451319 


Macaca 
fascicularis 


hypothetical protein 


2408 


91 


297 


gi7243039 


Homo sapiens 


mRNA for KIAA1329 protein, partial 
cds. 


4761 


100 


297 


gil2007720 


Mus musculus 


VPS10 domain receptor protein
SorCS2 


4466 


88 


297 


gi7715916


Mus musculus 


SorCSb splice variant of the VPS10 
domain receptor SorCS 


2177 


47 


298 


AAM00812 




Human bone marrow protein, SEQ ID
NO: 175.


InOo 


GO 


298 


gil2846045 


Mus musculus 


putative 


1387 


65 


298 


AAM00925 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 401. 


996 


100 


299 


gi7298852 


Drosophila 
melanogaster 


CG10068 gene product 


609 


43 


299 


gi8655669 


Homo sapiens 


mRNA; cDNA DKFZp547C176 (from 
clone DKFZp547C176). 


482 


52 


299 


AAB42048 


Homo sapiens 


Human ORFX ORF1812 polypeptide 
sequence SEQ ID NO:3624. 


325 


46 


300 


gil4043285 


Homo sapiens 


Similar to KIAA0808 gene product, 
clone MGC:15880 IMAGE:3529159, 
mRNA, complete cds. 


1306 


97 


300


gi7263912 


Homo sapiens 


Human DNA sequence from clone 
RP5-1108D11 on chromosome
20q12-13.11 Contains part of the gene for a
novel protein similar to C. elegans 
T22C1.7, part of the gene for a novel 
HMG (high mobility group) box 
protein similar to KIAA0737,

VTA A Hone an/1 THTDf^O /"/"i A T^TICW 

jsjaauouh ana inkuv (OAvjryj, 
ESTs, STSs, GSSs and two putative


797 


96 


300 


gi3882337 


Homo sapiens 


mRNA for KIAA0808 protein, 

complete cds.


767 


55 


301 


gil5430292 


Homo sapiens 


muscle alpha-kinase (MAK) mRNA, 
complete cds. 


5445 


99 


301 


ei7243G41 


Homo sapiens


nusiNA ior isJAAi protein, partial 
cds. 


4933


100


301 




Mus musculus


myocytic induction/differentiation 
originator 


3684 


72 


302 


gil4550508 


Homo sapiens 


Similar to CG8974 gene product, clone 
MGC:2460 IMAGE:2964524, mRNA,
complete cds. 


589 


100 


302 


gi!5928691 


Mus musculus 


Unknown (protein for MGC:19394)


j if 


07 


302 


gi2564951 


Mus musculus 


unknown 


378 


72 


303 


gi7242955 


Homo sapiens 


mRNA for KIAA1300 protein, partial 
cds. 


9573 


99


303 


gi6599162 


Homo sapiens 


mRNA; cDNA DKFZp434N1272 
(from clone DKFZp434N1272); partial 
cds. 


1392 


98 


303 


AAG75083 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5847.


628 


92 


304 


gil408209 


Homo sapiens 


Human endogenous retrovirus HERV- 
K(HML6) proviral clone HML6.17 
putative polymerase and envelope 
genes, partial cds, and 3'LTR. 


398 


86 


304 


gi2801455 


Mouse
mammary
tumor virus


Pr160


176


48

304 


gi6911288 


Exogenous 
mouse 

mammary

tumor virus 


Gag-Pro-Pol


17/; 


**o 


305 


gil4269502


Homo sapiens


unconventional myosin 1G valine form
(MYO1G) mRNA, MYO1G-V allele,
partial cds.






305 


gil4269504 


Homo sapiens 


unconventional myosin 1G methionine
form (MYO1G) mRNA, MYO1G-M
allele, partial cds.


3266 


97 


305 


gi3724141


Rattus 
norvegicus 


myosin I


^nn 


D/ 


306 


gi2145060


Homo sapiens


x xxx ixiicxavuxxg peptide X»V XXXCVlN/Y, 

partial cds. 




oo 
yy 


306 


gi2224593


Homo sapiens


Tinman niPKA fr\r TTTA L(W)(\ rr*»r»*» 

xxuinaii iut\-Li/\ 10* gene, 
partial cds. 


O**o 


io 


306 


gi488555 


Homo sapiens


Wit man sriitr fitiffpr nrntmn V M t? 1 "3 ^ 
llUHiail mills XXUgCX piUlCLU ZjI/Nx IjJ 

mRNA, complete cds.


^on 


/to 


307 


gil31838S3 


Homo sapiens 


PD-l-ligand 2 protein (PDL2) mRNA, 
complete cds. 


1417 


99 


307 


gil3569410 


Homo sapiens


butyrophilin precursor B7-DC mRNA,
complete cds.


i a\ n 


oo 

yy 


307 


AAE01352 


Homo sapiens


Human gene 1 encoded secreted
protein HDPPA04, SEQ ID NO:74.


141 £ 
HID 


oo 
yy 


308 


AAB87436 


Homo sapiens 


Human gene 22 encoded secreted
protein fragment, SEQ ID NO:177.


JO J 


inn 
LUU 


308 


AAB94868 


Homo sapiens 


Human protein sequence SEQ ID
NO: 16072. 


30 J 


inn 


308 


gil0436314 


Homo sapiens 


cDNA FLJ13984 fis, clone
Y79AA1001846. 


383 


100 


309 


AAY85025 


Homo sapiens


Hlimiin nQnV amino o^i/l ronnon/>a 

xxuxxioxx xvats,£ aimnu avxu oCCflxCnCC. 


9A/C 


JO 


309 


gi4678734 


Homo sapiens 


Human gene from PACs 37M17 and

^OSRIfi rhrnmr»cnmp Y ctmitor -tr* 
JvJOlU) V1XL\JJXUK>(J1XXV ^V, olxxxlLar LU 

small G proteins, especially RAP-2A. 


206 


33 


309 


AAM00956 


Homo sapiens


Human bone marrow protein, SEQ ID
NO: 432.




JZ 


310 


gi36905 


Homo sapiens


Human mRNA for T-cell receptor
alpha-chain HAP50 V(a)8.2-J(a)M.


<on 


ion 


310 


gil223888 


synthetic 
construct 


T cell receptor alpha chain 


586 


100 


310 


gi2358036 


Homo sapiens 


T-cell receptor alpha delta locus from 
bases 250472 to 501670 (section 2 of 
5) of the Complete Nucleotide 
Sequence. 


586 


100 


311 


AAE01596 


Homo sapiens 


Human gene 13 encoded secreted 
protein HCLCJ15, SEQ ID NO:146. 


1066 


92 


311 


AAE04136 


Homo sapiens 


Human gene 6 encoded secreted 
protein HCLBW50, SEQ ID NO:123. 


1066 


92 


311 


gi31135 


Homo sapiens 


H.sapiens mRNA for elongation factor
1-beta. 


1066 


92


312 


gi7243137 


Homo sapiens 


mRNA for KIAA1378 protein, partial
cds.


2400


99



WO 02/081731



PCT/US02/01222



Table 2



SEQ ID NO:


Accession No.


Species


Description


Score


%
Identity






312 


gil2314036 


Homo sapiens 


Human DNA sequence from clone 
RP3-383J4 on chromosome In? 4 1- 

«■/ \SS *l 1 VU VIUVUUI0UU1W lUlili X 

24.3 Contains part of a gene encoding a 
kelch motif containing protein, part of a
novel gene encoding a protein similar
to Aspartyl-tRNA synthetase, a
putative novel gene, a 40S ribosomal 
protein S27 (RPS27) pseudogene, 2 
CpG islands, ESTs, STSs and GSSs, 
complete sequence. 


1184 


44 


312 


gi4650844 


Homo sapiens 


mRNA for Kelch motif containing 
protein, complete cds. 


1176 


44 


313 


gi7019945 


Homo sapiens 


cDNA FLJ20079 fis, clone COL03057. 


1610 


83 


313 


gil2804721 


Homo sapiens 


clone MGC:2663 IMAGE:3543910,
mRNA, complete cds. 






313 


AAB43912 


Homo sapiens 


Human cancer associated protein
sequence SEQ ID NO:1357.






314 


AAB41414 


Homo sapiens 


Human ORFX ORF1178 polypeptide
sequence SEQ ID NO:2356. 


5094 


97


314 


gi6329897 


Homo sapiens 


mRNA for KIAA1137 protein, partial
cds. 


4798 


98 


314 


gil4043759 


Homo sapiens 


clone IMAGE:4111596, mRNA,
partial cds. 


3906 


98


315 


AAB28375 


Homo sapiens 


Human hyperpolarisation-activated 
channel HAC3. 


3686 


99 


315 


gi7959337 


Homo sapiens 


mRNA for KIAA1535 protein, partial 
cds. 


3665 


99 


315 


gi3242244 


Mus musculus


hyperpolarization-activated cation 
channel, HAC3 


3556 


96 


316 


gil4198399 


Mus musculus 


RIKEN cDNA 1500034J20 gene 


837 


93 


316 


gil2854536 


Mus musculus 


putative 


837 


93 


316 


gil4250857 


Homo sapiens 


Human DNA sequence from clone 

RP5-1137017 on ehmmn^nmp 1 1r»1 9_ 
14.2 Contains part of a gene similar to
putative mitochondrial inner
membrane protease subunit 2, a novel
mRNA, ESTs and GSSs, complete 
sequence. 


775 


100 


317 


gil0439850 


Homo sapiens 


cDNA:FLJ23233 fis, clone 
CAS00458. 


1081 


50 


317 


gi9968290 


Homo sapiens 


mRNA for zinc finger protein (ZNF304 
gene). 


1039 


48 


317 


gil4249844 


Homo sapiens 


Similar to hypothetical protein 
FLJ23233, clone MGC:14876
IMAGE:3544044, mRNA, complete 
cds. 


1037 


47 


318 


gil 1863686 


Mus musculus 


neurobeachin


3371 


96 


318 


gil 1863539 


Gallus gallus


neurobeachin 


2100 


89 


318 


AAB92596 


Homo sapiens 


Human protein sequence SEQ ID 
NO:10843. 


1721 


100 


319 


gi!2698174 


Macaca 
fascicularis 


hypothetical protein 


1221 


95 
319


gil0439153 


Homo sapiens 


cDNA: FLJ22672 fis, clone HSI09265. 


1085 


99 


319 


gi7020125 


Homo sapiens 


cDNA FLJ20190 fis, clone COLF0714.


893 


50 


320 


gi2865219 


Homo sapiens 


integrin binding protein (Del1)
mRNA, complete cds.


447 


inn 


320 


AAW94685 


Homo sapiens 


Human Del-1 protein. 


438 


98 


320 


AAW10365 


Homo sapiens


Human developmentally-regulated
endothelial cell locus-1 protein.


A1 ft 


Oft 

70 


321 


AAB27246 


Homo sapiens


Human FVM"ATV?4 WO TH MO- 74 


/Art 1 


100


321 


AAB42385 


Homo sapiens 


Human ORFX ORF2149 polypeptide 


2047 


100 


321 


gi52998 


Mus musculus 


macrophage mannose receptor 


164 


31 


322 


gil2834087


Mus musculus


putative


14j0 




322 


gi2463628


Homo sapiens


nuiiiau putative monocarooxyiaie 
transporter (MCT) mRNA, complete 
cds. 


DUO 




322 


gi2198807 


Gallus gallus 


monocarboxylate transporter 3 


473 


27 


323 


gil5620909 


Homo sapiens 


mRNA for KIAA1925 protein, partial 
cds. 


1059 


38 


323 


AAB92496 


Homo sapiens 


Human protein sequence SEQ ID 


1050 


36 


323 




Homo sapiens


cujn/y rLJiuuoo ns, Clone 
HEMBA1001455. 


1050 


36 


324 


ei9651075 


Macaca
fascicularis




i/lo 


95 


324 




Sus scrofa


basic proline-rich protein


222 


26 


324 


gi59 17666 


Zea mays


extensin-like protein


195 


25 


325 




Homo sapiens


Human DNA sequence from clone 
RP3-402N21 on chromosome 6p21.1-
21.31. Contains up to three novel genes
with MAM and immunoglobulin
domains. Contains ESTs, STSs, GSSs
and four putative CpG islands,
complete sequence.


1474 


100 


325 


gil 2836077 


Mus musculus 


putative




o^ 


325 


AAE00586 


Homo sapiens 


Human nuclear cell adhesion molecule
homologue NCAM d 1 protein




40 


326 


gil5278193 


Homo sapiens 


MAGI-1C beta mRNA, complete cds, 
alternatively spliced. 


1492 


100 


326 


gi2702351 


Mus musculus 


putative membrane-associated 
guanylate kinase 1 


1112 


83 


326 


gi5817255 


Homo sapiens 


mRNA; cDNA DKFZp434B203 (from 
clone DKFZp434B203); partial cds. 


739 


100 


327 


AAB01432 


Homo sapiens 


Human TANGO 239 (form 2). 


3675 


99 


327 


AAB01426 


Homo sapiens 


Human TANGO 239. 


2700 


100 


327 


AAB00036 


Homo sapiens 


Human TANGO 239 partial sequence. 


2483 


97 


328 


gi7243117 


Homo sapiens 


mRNA for KIAA1368 protein, partial 
cds. 


5542 


100 


328 


AAY71460 


Homo sapiens 


Human semaphorin 6A-1. 


5422 


98 


328 


gil0187891 


Homo sapiens 


unnamed protein product 


5422 


98 


329 


gil3676461 


Macaca 
fascicularis 


hypothetical protein 


2193 


75


329 


gi4589566 


Homo sapiens 


mRNA for KIAA0961 protein, 
complete cds. 


2190 


75 


329 


gi456269 


Mus musculus 
domesticus 


zinc finger protein 30


2073 


71 


330 


AAB94295 


Homo sapiens 


Human protein sequence SEQ ID 


3062 


99 


330 


mi 0434454 


Homo sapiens


cL/XNi\. r lj iz /oo lis, cione 
NT2RP2001576, weakly similar to 
HYPOTHETICAL 62.2 KD PROTEIN 
C4G8.12C IN CHROMOSOME I. 


oOoZ 


519 


330 


gi7291781 


Drosophila 
melanogaster


CG3419 gene product 


471 


32 


331 


gil2852801 


Mus musculus 


putative


1185 


95 






Homo sapiens 


Human DNA sequence from clone 

DOC O/f /TC1 "2 A « „i-.. n _ _ - t _o 1 l 

Kr:>-o4or 1 3 on chromosome lp2 1.1- 
22. 1 Contains part of the PPAP2C 
(phosphatidic acid phosphatase type 2c) 
gene, ESTs, STSs and GSSs, complete 
sequence. 


975 


100 


331 




Homo sapiens


cuina rLjzujuu us, clone j±bJruo4o->. 


74o 


56 


332 


gil2309630 


Homo sapiens 


Human DNA sequence from clone 
jvr i i*njoDZj on enromosome y 

Contains a novel gene for a neuronal
leucine-rich repeat protein, ESTs, STSs
and GSSs, complete sequence.


3138 


100 


332 


AAB31161 


Homo sapiens 


Amino acid sequence of a human 
TOLL protein. 


2600 


86 


332 


gil3444976 


Homo sapiens 


unnamed protein product 


2600 


86 


333 


ei4240145 


Homo sapiens


iuivin a ioi aj/iAi/ozo protein, parnai 
cds. 




99 


333 


gil4249936 


Homo sapiens


Similar to S-adenosylhomocysteine
hydrolase-like 1, clone
IMAGE:3536052, mRNA, partial cds.




100


333 


AAW56097 


Homo sapiens 


Amino acid oefliienc.^ of fhf» ftTYTi^h 4 ? ^ 
enzyme. 






334 


gil3625385 


Homo sapiens 


EPI64 (EPI64) mRNA, complete cds. 


1026 


46 


334 


AAB95321 


Homo sapiens


Human protein sequence SEQ ID
NO:17577.






334 


gil0435007 


Homo sapiens 


cDNA FLJ13130 fis, clone
NT2RP3002972, weakly similar to 
Halocynthia roretzi mRNA for HrPET- 

1- 1 


1023 


50 


335 


gil5862408 


Homo sapiens 


unnamed protein product 


2255 


95 


335 


gil3272520 


Mus musculus 


pancreatitis-induced protein 49 


2021 


85 


335 


AAE02778 


Homo sapiens 


Human PRO-C-MG.64 protein encoded 
by DNA-C-MG.64-1776 cDNA clone. 


1784 


95 


336 


gil5862408 


Homo sapiens 


unnamed protein product 


2281 


99 


336 


gil3272520 


Mus musculus 


pancreatitis-induced protein 49 


2047 


88 


336 


AAE02778 


Homo sapiens 


Human PRO-C-MG.64 protein encoded 
by DNA-C-MG.64-1776 cDNA clone. 


1810 


99 


337 


gi4545313 


Mus musculus 


prominin-like protein 


1021 


77 


337


gil5042603 


Rattus 
norvegicus 


prominin 


647 


30


337


AAB94028 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14170. 


642 


29 


338 


gi2978255 


Mus musculus 


myeloid zinc finger protein-2 


212 


42 


338 


AAB54292 


Homo sapiens 


Human pancreatic cancer antigen 
protein sequence SEQ ID NO:744. 


208 


30 


338 


gi8886436 


Homo sapiens 


myeloid zinc finger protein 1 splice 
variants (ZNF42) gene, complete cds, 
alternatively spliced 


207 


42 


339 


gi3882269 


Homo sapiens 


mRNA for KIAA0774 protein, partial 
cds. 


5974 


99 


339 


gil2860422 


Mus musculus 


putative 


692 


96 


339 


gil5424451 


Homo sapiens 


hATIP3 


606 


36 




AAB36617 


Homo sapiens 


Human FLEXHT-39 protein sequence
SEQ ID NO:39. 


584 


100 


340 


gi8218050 


Homo sapiens 


Human DNA sequence from clone 
RP1-187J11 on chromosome 6q11.1-
22.33. Contains the gene for a novel
protein similar to S. pombe and S.
cerevisiae predicted proteins, the gene
for a novel protein similar to protein
kinase C inhibitors, the 3' end of the
gene for a novel protein similar to

proteins, ESTs, STSs, GSSs and two
putative CpG islands, complete
sequence.


562 


100 


340 


gil3540300 


Mus musculus 


nucleolar protein C7B 


415 


66 


341 


gil4583268 


Homo sapiens 


cytoplasmic protein mRNA, complete
cds.






341 


gi2104769 


Homo sapiens 


echinoderm microtubule-associated
protein homolog HuEMAP mRNA,
complete cds.




DD 


341 


gi4406218 


Homo sapiens 


echinoderm microtubule-associated
protein-like EMAP2 mRNA, complete 
cds. 




50 


342 


AAB60099 


Homo sapiens 


Human transport protein TPPT-19. 


1616 


93 


342 


gi7294748 


Drosophila 
melanogaster 


CG7616 gene product


580 




342 


gil47 14781 


Mus musculus 


RIKEN cDNA 2610005A10 gene


441 


15 


343 


AAB94374 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14915. 


3938 


99 


343 


gil0434690


Homo sapiens 


cDNA FLJ12921 fis, clone
NT2RP2004600. 


3938 


99 


343 


gi5689736 


Homo sapiens 


mRNA for myopodin 


883 


34 


344 


AAY72604 


Homo sapiens 


Human Electron Transfer Protein, 
ETRN-2. 


717 


100 


344 


gil0953950 


Geochelone 
carbonaria 


alpha-D chain hemoglobin 


407 


54 


344 


gi4455876 


Cairina 
moschata 


alpha D-globin 


398 


53 


345 


AAY72604 


Homo sapiens 


Human Electron Transfer Protein, 
ETRN-2. 


668 


78 


345 


gil0953950 


Geochelone 


alpha-D chain hemoglobin 


359 


43


345


gi4455876 


Cairina 
moschata 


alpha D-globin 


349 


41 






Homo sapiens


mRNA; cDNA DKFZp547C176 (from
clone DKFZp547C176).


1053 


100






Homo sapiens


Human ORFX ORF1812 polypeptide
sequence SEQ ID NO:3624. 


840 


100 


346




Drosophila 
melanogaster 


CvjIUUoo gene product 


601 


40 


347 


gil5778899 


Homo sapiens 


Similar to f-box only protein 17, clone 
MGC:11162 IMAGE:3841901, mRNA,
complete cds. 


1537 


99 




giyzouuou 


Macaca 
fascicularis 


unnamed protein product 


1435 


95 






Homo sapiens 


Similar to f-box only protein 17, clone 
MGC:9379 IMAGE:3864760, mRNA,
complete cds. 


857 


56 


348 


AAG64860 


Homo sapiens 


Heart muscle cell differentiation related 
protein o&\i id inu: 01 . 


1079 


90 


348 


AAB99931 


Homo sapiens 


Human MesP1 protein sequence SEQ


1079 


90


348 


gil3623241 


Homo sapiens 


Similar to mesoderm posterior 1, clone 
MGC: 10676 IMAGE:3944350, mRNA, 
complete cds. 


1079 


90 


349 


<ri4?1S144 


Homo sapiens


chromosome 19, BAC 39498 (CIT-B-
26X23), complete sequence. 


387 


100 


349 


gi8 163824 


Homo sapiens 


krueppel-like zinc finger protein HZF2 
mRNA, complete cds.


290 


74 


349 


AAY39779 


Homo sapiens 


CBMACD04 protein sequence. 


286 


71 


350 


gi7673618 


Mus musculus 


ubiquitin specific protease 


2016 


73 


350 




Homo sapiens 


mRNA for KIAA1063 protein, partial
cds. 


2000 


64 


350 


gil6198231 


Drosophila 
melanogaster 


LD43147p 


1188 


46 


351 


gil3540193 


Homo sapiens 


isopentenyl pyrophosphate isomerase 1
(IDI1), HT009-like protein, and
isopentenyl pyrophosphate isomerase 
type 2 (IDI2) genes, complete cds. 


1202 


100 




gu3925766 


Homo sapiens 


isopentenyl diphosphate dimethylallyl 
diphosphate isomerase 2 (IDI2) gene,
exon 4 and complete cds. 


1202 


100 


351 


gil3925769 


Homo sapiens 


isopentenyl diphosphate dimethylallyl 
diphosphate isomerase 2 (IDI2) 
mRNA, complete cds. 


1202 


100 


352 


gil3561001 


Homo sapiens 


Human DNA sequence from clone 
RP11-528A10 on chromosome 6.
Contains an IMPDH1 (IMP (inosine 
monophosphate) dehydrogenase 1) 
pseudogene, an RNA helicase 
pseudogene, a novel gene similar to 
KIAA0161, ESTs, STSs and GSSs, 
complete sequence. 


950 


100 


352 


gi!3991706 


Mus musculus 


UbcM4-interacting protein 4 


655 


53


352


gil 136384


Homo sapiens 


Human mRNA for KIAA0161 gene,
complete cds. 


651 


53 


353 


gil3561001 


Homo sapiens 


Human DNA sequence from clone 
RP11-528A10 on chromosome 6.
Contains an IMPDH1 (IMP (inosine
monophosphate) dehydrogenase 1) 
pseudogene, an RNA helicase 
pseudogene, a novel gene similar to 
KIAA0161, ESTs, STSs and GSSs, 
complete sequence. 


709 


79 


353 


gil3991706 


Mus musculus 


UbcM4-interacting protein 4 


506 


45 


353 


gil 136384 


Homo sapiens 


Human mRNA for KIAA0161 gene, 
complete cds. 


502 


44 


354 


AAB74446 


Homo sapiens 


Human protease-inhibitor like protein. 


2759 


100 


354 


gil2053227 


Homo sapiens 


mRNA; cDNA DKFZp434B044 (from 
clone DKFZp434B044); complete cds. 


2756 


99 


354 


gil5593902 


Homo sapiens 


unnamed protein product 


2743 


99 


355 


AAB94358 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14883. 


1788 


98 


355 


gil0434632 


Homo sapiens 


cDNA FLJ12886 fis, clone 
NT2RP2004041, weakly similar to 
SYNAPSINS IA AND IB. 


1788 


98 


355 


gil2052738 


Homo sapiens 


mRNA; cDNA DKFZp564H1322 
(from clone DKFZp564H1322);
complete cds. 


1788 


98 


356


gll343o437 


Homo sapiens


Similar to RIKEN cDNA 5730438N18
gene, clone MGC:4399
IMAGE:2905957, mRNA, complete
cds.


1634 


99 


356 


gil5030091 


Mus musculus 


Similar to RIKEN cDNA 5730438N18
gene 


1508 


91 


356 


AAB43372 


Homo sapiens 


Human ORFX ORF3136 polypeptide 
sequence SEQ ID NO:6272. 


1464 


91 


357 


AAB73511 


Homo sapiens 


Human transferase HTFS-18, SEQ ID 
NO: 18. 


1880 


99 


357 


AAG74560 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5324. 


450 


98 


357 


AAG02792 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
6873. 


324 


96 


358 


gi7673618


Mus musculus


ubiquitin specific protease


2711


95 


358 


gi5689463 


Homo sapiens 


mRNA for KIAA1063 protein, partial 
cds. 


2382 


78 


358 


gi5823525 


Drosophila 
melanogaster 


ubiquitin-specific protease nonstop 


1305 


49 


359 


AAB94775 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 15864. 


1022 


100 


359 


gil0435984 


Homo sapiens 


cDNA FLJ13842 fis, clone 
THYROI000793. 


1022 


100 


359 


gi2340162 


Xenopus 
laevis 


dsRBP-ZFa 


380 


44 


360 


gi3676086 


bacteriophage 
PS119 


gpl9 


291 


59 


360 


gil778468 


Escherichia
coli


hypothetical protein


287


59








360 


gil786768 


Escherichia 
coli K12


bacteriophage lambda lysozyme
homolog 


287 


59 


361 


gil3544003 


Homo sapiens 


clone IMAGE:3677165, mRNA, 
partial cds. 


2172 


88 


361 


gi3 169073 


Schizosaccharomyces
pombe


phenylalanyl-tRNA synthetase,
mitochondrial precursor 


233 


33 


361 


gil3877969 


Arabidopsis 
thaliana 


putative phenylalanine-tRNA
synthetase 


228 




362 


gi293694 


Mus musculus


laminin receptor 


370 


49 


362 


gil3277921 


Mus musculus


laminin receptor 1 (67kD, ribosomal
protein SA) 


367 




362 


gi4633839 


Mus musculus 


37kDa oncofetal antigen


367 


dO 


363 


gil5082271 


Homo sapiens


testes HevelfiTwneTit-TplatpH NYTi-SP9 1 

mRNA, complete cds. 


lO/O 


100


363 


gi6807923 


Homo sapiens 


mRNA; cDNA DKFZp434H092 (from
clone DKFZp434H092); partial cds.


J. UZiU 




363 


gi7294427 


Drosophila 
melanogaster 


CG8797 gene product 


118 


21 


364 


AAE01355 


Homo sapiens 


Human gene 4 encoded secreted 
protein HRABV43, SEQ ID NO:77, 


2724 


97 


364 


gil2836042 


Mus musculus 


putative




03 


364 


AAE01380 


Homo sapiens 


Human gene 4 encoded secreted 
protein HRABV43, SEQ ID NO:102.


2500 


97 


365 


gil0439688 


Homo sapiens 


cDNA: FLJ23109 fis, clone 
LNG07754. 


2809 


99 


365 


gi9622093 


Mus musculus


E-cadherin binding protein E7


Z/05 


G*7 


365 


AAG01765 


Homo sapiens


Human secreted protein, SEQ ID NO:
5846.


/j / 


OQ 


366


^il2854995 


Mus musculus 


putative




71


366 


gil0241691 


Homo sapiens 


Novel human gene mapping to 

chromosome 22


791 


99 


366 


gil4602790 


Homo sapiens 


DKFZP566F0546 protein, clone 

MfiT'9444 TM AnF»9«99S7n m RWA 

complete cds. 


791 


99 


367 


gil5082283 


Homo sapiens


Similar to small glutamine-rich
tetratricopeptide repeat (TPR)-
containing, clone MGC:10496
EMAG6*362S A 93 mRNA cnrnnlpfp 

UVJ^lXJlv. JVA£r<J77«Sj llUVlN-TV, WJLUlJlCtC 

cds. 




100


367 


gi3377591 


Homo sapiens 


full length insert cDNA YN88E09. 


592 


100 


367 


gi!5488015 


Homo sapiens 


TPR-containing co-chaperone mRNA,
complete cds. 


450 


64 


368 


gi9104819 


Xylella 
fastidiosa 9a5c 


hypothetical protein 


151 


43 


368 


AAY59981 


Homo sapiens 


Human endometrium tumour EST 
encoded protein 41. 


128 


46 


368 


AAE03351 


Homo sapiens 


Human gene 4 encoded secreted 
protein fragment, SEQ ID NO: 126. 


121 


58 


369 


gi5817053 


Homo sapiens 


mRNA; cDNA DKFZp586D0824 
(from clone DKFZp586D0824); partial
cds. 


571 


43 


369


gi!5530285 


Homo sapiens 


clone MGC:24275 IMAGE:3950542,
mRNA, complete cds.


571


43










Mus musculus


immunity-associated nucleotide 4 


540


42 


370 


gi8453103 


Homo sapiens 


zinc finger protein mRNA, complete 
cds. 


1296 


58 


370


oil ^01^170 


Homo sapiens


zinc finger protein 10 (KOX 1), clone
MGC:15145 IMAGE:3949487, mRNA, 
complete cds. 


1296 


58


370




Homo sapiens 


ri.sapiens iiZiuU rnKNA lor zinc 
finger protein. 


1279 


55 


371 




Homo sapiens


Similar to hypothetical protein 
rLJiu/uz, cione iYiuu.zii04 
uvi/wjc'fjyiozi, nn\iN a, complete 
cds. 


973 


100 


371 


AAB42336 




Human ORFX ORF2100 polypeptide
sequence SEQ ID NO:4200.




y$ 


371 


AAB93080 


Homo sapiens


Human protein sequence SEQ ID
NO:11912.






372 


ei7328451 


Mus musculus


sialidase




44


372 


AAB93971 


Homo sapiens 


Human protein sequence SEQ ID 


866 


42 


372 


AAW73964 


Homo sapiens 


Human sialidase protein sequence. 


866 


42 


373 


ei 1480005 


Mus musculus


Z/icn protein 


I4yu 


86 


373 


AAB14349


Homo sapiens 


Human Zic1 protein.


1102 


67 


373 


oil 208429 


Homo sapiens


mRNA for ZIC protein, complete cds.


1102 


67 


374 


gil2860114 


Mus musculus 


putative 


876 


40 


374 


gil61958 


Trypanosoma 
cruzi 


surface antigen 


177 


23 


374 




Xenopus 
laevis 


APEG precursor protein 


174 


26 


375 


AAY99349 


Homo sapiens 


Human PRO1110 (UNQ553) amino
acid sequence SEQ ID NO:31.


1683 


100 


375 


AAB19729 


Homo sapiens 


Human SECX Clone 4339264-2 
encoded protein. 


1683 


100 


375 


AAB15549 


Homo sapiens 


Human immune system molecule from 
Incyte clone 2774913. 


1683 


100 


376 


£12746394 


Homo sapiens 


CUG-BP and ETR-3 like factor 4
(CELF4) mRNA, complete cds.


936 


100 


376 


gil3278792 


Homo sapiens 


Bruno (Drosophila) -like 4, RNA 
binding protein, clone MGC:2693 
IMAGE:2820541, mRNA, complete
cds. 


911 


98 


376 


gil2804985 


Homo sapiens 


Similar to etrl, clone MGC:4320 
IMAGE:2820541, mRNA, complete 
cds. 


911 


98 


377 


gil2746394 


Homo sapiens 


CUG-BP and ETR-3 like factor 4 
(CELF4) mRNA, complete cds. 


905 


89 


377 


gil3278792 


Homo sapiens 


Bruno (Drosophila) -like 4, RNA 
binding protein, clone MGC:2693 
IMAGE:2820541, mRNA, complete
cds. 


880 


88 


377 


gil2804985 


Homo sapiens 


Similar to etrl, clone MGC:4320 
IMAGE:2820541, mRNA, complete
cds. 


880 


88


378


gllZot 1UOU 


Mus musculus 


putative 


809


75 


378 


£7293285 


Drosophila 
melanogaster


CG4768 gene product 


239 


37 


378 


gi!938566 


Caenorhabditis 
elegans 


Hypothetical protein C48B6.3 


123 


38 


379 


gi3880385 


Caenorhabditis 
elegans 


predicted using Genefinder; contains
similarity to Pfam domain: PF01484
(Nematode cuticle collagen N-terminal
domain), ocore=51.5, E-value^o.le-lz, 
JN— 1~CU1NA nM yKy4a4.j comes from 
uus gene~cuiNA doi yKy*fa4.j comes 
from this gene-cDNA EST yk68dl .5 
conies uuixi uiis gcne* w CL^iN/v no i 

vfc£RH1 ^ mmM from thic opn^ 
jfrwvou-i .^j u viiu mia gone 


79 


35 


379 


gi6684 


Caenorhabditis
elegans 


ii7iTiflmf»/1 nrrtfnn nr/vlurf* 


70 
/y 


JJ 


379 


ei 156262 


Vcawxiui ixau mm 

elegans 


l/UxiagCxl 


7Q 


75 


380 


AAB85365 


xxujlxiu iM|;iwiU> 


iNuvt/i v on w liieozonu/ixuoroDosporin- 
like mature protein sequence. 


£^7 
03/ 




380 


AAB85364 


Homo sapiens


MrtVpl Vati Will aV» ran H/fVimmKrtcmrvriT*- 

i>uvci vviii w mcuiaiiu/ uxiuuiuuopuriii* 
like nolvnentfdf* % 


03/ 


Oil 


380 


gil2836633 


Mus musculus 


putative 


651 


59 


381 


eil 5024264 


Mus musculus 


ribncnmnl rvmtfin T 

llUUOUUial pXUlvlll 1jJJ4 


101 

171 




381 


gi57119 


Rattus 
norvegicus


ribosomal protein L35a (aa 1-110)


191 


53 


381 


gil2846322 


Mus musculus 


putative 


191 


53 


382 


gil2835133 


Mus musculus 


putative


01 / 


71


382 


gi7293113 


Drosophila
melanogaster


CG12379 gene product 


283 


72


382 


gi6042159 


Caenorhabditis
elegans


Hypothetical protein F53A3.7 


226 


55 


383 


AAB81053 


XlXJLLLVf oCU/XCXld 


XlUUIaQ piVJLClil xxx UlOHv allUTlO aClQ 

sequence


0^9 


1 AA 


383 


gil2841896 


Mus musculus 


putative 


925 


98 


383 


ei7303144 


Drosophila
melanogaster




£19 
01Z 


OJ 


384 


cil 0440373 


Homo sapiens


mPMA fnr FT TfifMY)9 nrntPtn martial 
xux rwvut/M piutcixx, paiticix 

cvlfi 




07 


384 


gil0440396 


Homo sapiens 


mRNA for FLJ00031 protein, partial
cds. 


647 


88 


384 


gil086626 


Caenorhabditis 
elegans 


Hypothetical protein C06A6.3


273 


33 


385 


gil2053305 


Homo sapiens 


mRNA; cDNA DKFZp434G099 (from 
clone DKFZp434G099); complete cds. 


1210 


100 


385 


gi2516239 


Mus musculus 


Rab33B 


1138 


94 


385 


gil2836564 


Mus musculus 


putative 


1138 


94 


386 


gi7243247 


Homo sapiens 


mRNA for KIAA1433 protein, partial
cds. 


3232 


100 


386 


AAB94053 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 14222. 


3223 


99 


386 


gil3096872 


Mus musculus 


Unknown (protein for MGC:7720) 


2906 


89 


387 


gil4599491 


Homo sapiens 


small proline-rich protein 2F (SPRR2F)
gene, complete cds.


458


100






387 


gil4599489 


Homo sapiens 


small proline-rich protein 2E 
(SPRR2E) gene, complete cds. 


444 


95 


387 


gi338423 


Homo sapiens 


Human small proline rich protein 
(sprII) mRNA, clone 930.


434 


94 


388


gioO 10699 


Rattus 
norvegicus 


F-box protein FBL2 


1449 


99 


388 


gil4043139 


Homo sapiens 


RIKEN cDNA 2610511F20 gene,
clone MGC:15482 IMAGE:2987858, 
mRNA, complete cds. 


1383 


100 


388 


gil2848653 


Mus musculus


putative 


1371 


99 


389 


gi2853265 


Rattus 
norvegicus 


jun dimerization protein 2 


800 


96 


389 


gil2248392 


Mus musculus


transcriptional inhibitory factor 


795 


95 


389 


gi6648146 


Homo sapiens 


chromosome 14 clone CTD-2317F5 
map 14q24.3, complete sequence. 


481 


100 


390 


gil5277240 


Homo sapiens 


genomic DNA, chromosome 6p21.3, 
HLA Class I region, section 17/20. 


1296 


100 


390 


gil 1875405 


Homo sapiens 


HZFwl protein mRNA, complete cds. 


1291 


99 


390 


gil 1875407 


Homo sapiens 


HZFw2 protein mRNA, complete cds. 


773 


99 


391 


gi6572201 


Homo sapiens 


Human DNA sequence from clone 
CITF22-27C3 on chromosome 
22q13.1-13.31. Contains a gene for a
novel protein (DJ1163J1.2) and part of
a gene for a novel protein (DJ1163J1.3,
similar to mouse B99), ESTs, STSs and 
GSSs, complete sequence. 


863 


100 


391 


gi4469186 


Homo sapiens 


Human DNA sequence from clone 
RP5-1163J1 on chromosome 22q13.2-
13.33. Contains the 3' part of a gene for
a novel KIAA0279 LIKE EGF-like
domain containing protein (similar to
mouse Celsr1, rat MEGF2), a novel
gene for a protein similar to C. elegans
B0035.16 and bacterial tRNA (5-
Methylaminomethyl-2-thiouridylate)-
Methyltransferases, and the 3' part of a
novel gene for a protein similar to 
mouse B99. Contains ESTs, GSSs and 
putative CpG islands, complete 
sequence. 


863 


100 


391 


AAB92551 


Homo sapiens 


Human protein sequence SEQ ID 
NO:10735. 


862 


96 


392 


gi5001720 


Mus musculus 


odd-skipped related 1 protein 


1413 


97 


392 


gil5778246 


Mus musculus 


odd-skipped related 2 


924 


66 


392 


gil5488723 


Mus musculus 


Unknown (protein for MGC: 19171) 


924 


66 


393 


AAB94364 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14895. 


2700 


99 


393 


gil0434650 


Homo sapiens 


cDNA FLJ12895 fis, clone
NT2RP2004187, weakly similar to 
ZINC FINGER PROTEIN 38. 


2700 


99 


393 


gil3623217 


Homo sapiens 


Similar to hypothetical protein
FLJ12895, clone IMAGE:3533093,
mRNA, partial cds.


2150


99






394 


gil2053105 


Homo sapiens 


mRNA; cDNA DKFZp434K111 (from
clone DKFZp434K111); complete cds.


3116 


100 


394 


gi2282582 


Mus musculus


actin-binding protein


2402 


74 


394 


AAR94386 


Homo sapiens 


TTliman tipittsiI f.f»l1 trmffin mnrlrpr 

llUlllflU U&UJKU vvll VXKJVCtxX I " '*! 1 IS Vl 

RR/B. 




1A 
IH 


395 


gi207145 


Rattus 
norvegicus 


synaptotagmin II 


2128 


95 


395 


ei7739733 


Mus musculus


synaptotagmin II


21/1 


95 


395 


gi688412 


Mus musculus 


synaptotagmin II/IP4BP


2121 


95 


396 


ril 5487674 ' 


XXlUliU oapiCDo 


\JoDr -reiaiea protein i mKJN A, 
complete cds. 


3220


99 


396 


AAB92611


Homo sapiens


Human protein sequence SEQ ID
NO: 10880. 


703 


100 


396 


AAYQ7701 


Homo sapiens


Lvipia associated protein (LXrAJrJ 
2764333CD1. 


703 


100 


397 


gXl li-J lUOJ 


Macaca
fascicularis 


hypothetical protein 


490 


76 


397 


gi2447128 


Paramecium
bursaria
Chlorella virus
1


contains 10 ankyrin-like repeats; 
similar to human ankyrin, corresponds 
to Swiss-Prot Accession Number


212 


33 


397 


gi6634025 


Homo sapiens 


mRNA for KIAA0379 protein, partial 
cds. 


203 


38 


398 


AAB21047 


XXULXiSJ OuX/lCXlO 


Human nucleic acid-binding protein,


lUo2 


100 


398 


gi833629 


Xenopus 
laevis 


nucleoplasmin


459 


49


398


gi64940 


Xenopus 
laevis 


niiclefmlflQmin ( A A 1 -OOTft 


•to J 


40 


399 


gi!5919272 


Homo sapiens 


putative forkhead/winged-helix
transcription factor (FOXP2) mRNA,
complete cds.


596 


84 


399 


gi2565057 


Homo sapiens 


CAGH44 mRNA, partial cds.




OH 


399 


gil4582802 


Mus musculus 


forkhead-related transcription factor 2


588 


82 


400 


AAB08199 


Homo sapiens 


Amino acid sequence of human
diacylglycerol kinase beta
(DAGKbeta).




0Q 


400 


gil0279722 


Homo sapiens 


unnamed protein product 


4217 


99 


400 


gi485398 


Rattus 
norvegicus 


90kDa-diacylglycerol kinase 


4046 


95 


401 


gi7670446 


Mus musculus 


unnamed protein product 


1295 


87 


401 


gil3 185203 


Homo sapiens 


unnamed protein product 


799 


83 


401 


AAY31642 


Homo sapiens 


Human transport-associated protein-4 
(TRANP-4).


466 


35 


402 


gil2837990 


Mus musculus 


putative 


985 


69 


402 


gi5668737 


Mus musculus 


UBE-lc2 


661 


50 


402 


AAB94645 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15538. 


426 


52 


403 


gil0439821 


Homo sapiens 


cDNA: FLJ23209 fis, clone 
ADSH00512. 


2596 


99 


403 


gil0440353 


Homo sapiens 


mRNA for FLJ00011 protein, partial
cds.


1448


97






403 


gi8217420 



Homo sapiens 


Human DNA sequence from clone 
RP11-108L7 on chromosome 10.
Contains part of the gene for a novel
Insulin-like growth factor binding type 
protein with Kazal-type serine protease 
inhibitor domain, the gene for a novel 
protein similar to rat tricarboxylate 
carrier, the gene for a novel PDZ 
(DHR, GLGF) domain protein, the 
gene for a novel protein similar to 
KIAA0552, KIAA0341 and Fugu 
hypothetical protein 2, the gene for a 
novel protein similar to Plasmodium
>OMl and C. elegans F46G1U, a 
putative novel gene, the SEMA4G gene 
for semaphorin 4G and a novel gene. 
Contains ESTs, STSs, GSSs and seven 
putative CpG islands, complete
sequence. 


1026 


100 


404 


AAB42219 


Homo sapiens 


Human ORFX ORF1983 polypeptide 
sequence SEQ ID NO:3966. 


2230 


96 


404 


gi3417297 


Homo sapiens


Human Chromosome 16 BAC clone
CIT987SK-A-635H12, complete
sequence.


2230 


96 


404 


gi!5559282 


Homo sapiens 


clone MGC:20208 IMAGE:3936339,
mRNA, complete cds.


1021 


53 


405 


gil3365905 


Macaca 
fascicularis 


hypothetical protein 


1154 


99 


405 


AAB15537 


Homo sapiens 


Human immune system molecule from 


911 


100 


405 
406 


AAE04891 
gi262843


Homo sapiens 
Rattus sp.


Human transporter and ion channel-4 
\ jl nii^ri-qj protein, 


360 


39 


406 


gi545078 


Rattus sp.


uvuiuiiaiiaLLu.iicx transporter 
iNaTfv^i^-^-aepenGent neurotransmitter 

tmTivnttrtw 
uouduux ICX 


3709 
3694 


96 
96 


406 


AAR88390 


Homo sapiens 


jlxui iiflLi uwuLv^uaJioLiiiiicr uansporosr 
protein. 


iooo 


96 


407 


AAB31212 


Homo sapiens


Amino acid sequence of human
polypeptide PRO6004. 


728 


100 


407 


AAB44331 


Homo sapiens 


Human PRO4993 protein sequence
SEQ ID NO:612.


717 


100 


407 


gi4519558 


Rattus
norvegicus 


Kilon 


667 


94


408 


gil5277972 


Mus musculus 


Similar to DnaJ (Hsp40) homolog, 
subfamily B, member 1 


808 


49 


408 


gi7804472 


Mus musculus 


heat shock protein 40 


808 


49 


408 


AAB72675 


Homo sapiens 


Human HDJ1. 


804 


48 


409 


gil2841015 


Mus musculus 


putative 


798 


52 


409 


AAB60114 


Homo sapiens 


Human transport protein TPPT-34. 


787 


51 


409 


gi!3435410 


Mus musculus 


Similar to RIKEN cDNA 1810012H11
gene 


768 


53 


410 


gi488555 


Homo sapiens 


Human zinc finger protein ZNF135
mRNA, complete cds.


1241


52






410 


AAY73346 


Homo sapiens 


HTRM clone 619699 protein sequence.


1238 


49 


410 


AAB43912 


Homo sapiens 


Human cancer associated protein 
sequence SEQ ID NO: 1357. 


1231 


49 


411 


gi837292 


Rattus 
norvegicus 


S100A1 gene product 


278 


59 


411 


AAB45531 


Homo sapiens 


Human S100A1 protein. 


274 


57 


411 


gil 1228039 


Homo sapiens 


S100A1 cDNA 


274 


57 


412 


AAB19851 


Homo sapiens 


Human muscle-specific protein Ozz. 


1504 


100 


412 


gil3929456 


Homo sapiens 


Human DNA sequence from clone 
RP3-337018 on chromosome 20q12-
13.1. Contains the PLTP gene encoding
Phospholipid Transfer Protein, the
PPGB gene coding for Lysosomal
Protective Protein precursor (EC
3.4.16.5, Cathepsin A,
Carboxypeptidase C) and the gene
encoding peroxisomal acyl-CoA
thioesterase (PTE1, thioesterase II),
four novel genes, the gene for a novel
protein similar to Drosophila
Neuralized (Neu) and the 5' end of an
isoform of the TNNC2 gene for fast
troponin C2. Contains three CpG
islands, ESTs, STSs and GSSs,


1504 


100 


412 


gi!2835750 


Mus musculus 


putative


1328 


89


413 


gil2847182 


Mus musculus 


putative 


875 


87


413 


ei4884173 


Homo sapiens


(from rlnnp TiVV7r\<AAnf\QQ'y\' «or^o1 

um cjonc i^A-TZ/p^cw-vjuyoz partial 
cds. 


646 


100 


413 


gil0047333 


Homo sapiens


uuviN/v iur i\x/v/\iozo protein, partial 
cds.


346


42 


414 


gi7959343 


Homo sapiens 


mRNA for KIAA1538 protein, partial 

cds. 


3286 


100 


414 


AAB42721 


Homo sapiens 


Human ORFX ORF2485 polypeptide 

sequence SEQ ID NO:4970.


382 


100 


414 


AAB42764 


Homo sapiens


Human ORFX ORF2528 polypeptide
sequence SEQ ID NO:5056.




41


415 


gil4043332 


Homo sapiens 


Similar to ring finger protein 23, clone 
MGC:2475 IMAGE:3051389, mRNA,
complete cds. 


1006 


43 


415 


gi10716078


Mus musculus 


testis-abundant finger protein 


995 


42


415 


gi10716076


Homo sapiens 


mRNA for testis-abundant finger 
protein, complete cds. 


966 


40 


416 


gi3599509 


Mus musculus 


rho/rac-interacting citron kinase 


1507 


61 


416 


gi3360512 


Rattus 
norvegicus 


Citron-K kinase 


1505


89 


416 


gi3599507 


Mus musculus 


rho/rac-interacting citron kinase short 
isoform 


1503 


89 


417 


gi2358070 


Mus musculus 


trypsinogen 1 


898 


65 


417 


gi603903 


Gallus gallus


trypsinogen 


408 


36 


417 


gi65163 


Xenopus
laevis


trypsin precursor


405


38


418 


gi440127 


Rattus 
norvegicus 


^cicurugiycdu 




5/ 


418 


AAB44256 




Human PJ?D7fK /JThiniXQ} nintem 

sequence SEQ ID NO: 109. 




40 


418 


AAY25909 


Homo sapiens


Hitman f»Dtf"VC nrnfoi'n 

nuiikiu vjjrv^o proiein. 




Ad 

46 


419 


AAM06489 


Homo sapiens 


Human foetal protein, SEQ ID NO: 
220. 


376 


82 


419 


gllz-O J JJ> / D 


Mus musculus


putative 


230 


31


419 


A AFfl7(K« 

A/iCvZvJ O 


Homo sapiens 


Human four disulfide core domain
(FDCD)-containing protein.


222 


31 


420 




Homo sapiens 


Human ORFX ORF2325 polypeptide 
sequence SEQ ID NO:4650. 


5075 


100 


420 


gl^*t 12*003 


Homo sapiens 


mRNA; cDNA DKFZp434N074 (from 
clone DKFZp434N074). 


5070 


99 


420 




Homo sapiens 


mRNA for KIAA0944 protein, partial
cds.


3375 


61 


421 


gi10438804


Homo sapiens 


cDNA: FLJ22419 fis, clone

tiKiJUojio. 


1026 


60 


421 


gi13938187


Homo sapiens 


hypothetical protein FLJ22419, clone 
Muuu^yuu 1MAChi:J3477o3, mRNA, 


1026 


60 


421 


gi6690339 


Mus musculus


hematopoietic zinc finger protein 


717 


47 


422 


AAB94721 


Homo sapiens


Human protein sequence SEQ ID
NO:15739.


1678 


99 


422 


gi10435784


Homo sapiens


Wri/iNA r jjj loouj lis, Clone 
PLACE2000111. 


1078 


99 


422 


gi5706454 


Homo sapiens 


mRNA for Natural killer cell p44 
related gene 2 (NKp44RG2). 


158 


29 


423 


gi15026974


Homo sapiens 


mRNA for obscurin (OBSCN gene). 


2713 


96 


423 


AAB95162 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 17205. 


1173 


86 


423




Homo sapiens 


clone IMAGE:2961284, mRNA, 
partial cds. 


540 


26 


424 


gll«>OUl JU*T 


Mus musculus


putative 


523 


51 


424 


AAE02058 


Homo sapiens 


Human four disulfide core domain
(FDCD)-containing protein.


485 


38 


424 


gi12655452


Homo sapiens 


mRNA for keratin associated protein 
4.7 (KRTAP4.7 gene). 


485 


40 


425 


gi12830335


Homo sapiens 


Human DNA sequence from clone
RP11-550O8 on chromosome 20.
Contains a novel gene encoding a 
protein kinase, an RPL7 (60S 
Ribosomal Protein L7) pseudogene, a 
CpG island, ESTs, STSs and GSSs, 
complete sequence. 


zuoz 


yy 


425 


AAB65688


Homo sapiens 


Novel protein kinase, SEQ ID NO: 216. 


1732 


100 


425 


AAB65690 


Homo sapiens 


Novel protein kinase, SEQ ID NO: 218. 


1184 


69 


426 


gi388518 


Homo sapiens 


Human Vbeta 5.5 mRNA for a new T 
cell receptor. 


627 


95 


426 


gi36173 


Homo sapiens 


H.sapiens rearranged T-cell receptor 
beta chain mRNA. 


613 


94 


426 


gi1552509


Homo sapiens 


Human germline T-cell receptor beta
chain TCRBV13S1, TCRBV6S8A2T,
TCRBV5S6A3N2T, TCRBV13S6A2T,
TCRBV6S9P, TCRBV5S3A2T,
TCRBV13S8P, TCRBV6S3A1N1T,
TCRBV5S2, TCRBV6S6A2T,
TCRBV5S7P, TCRBV13S4,
TCRBV6S2A1N1T, TCRBV5S4A2T,
TCRBV6S4A1, TCRBV23S1A2T,
TCRBV12S1A1N2, TCRBV21S2A2,
TCRBV8S1, TCRBV8S2A1T,
TCRBV8S3, TCRBV16S1A1N1,
TCRBV24S1A3T, TCRBV25S1A2PT,
TCRBV26S1P, TCRBV18S1,
TCRBV17S1A1T, TCRBV2S1,
TCRBV10S1P genes from bases
257519 to 472940 (section 2 of 3).


606


100






427 




Homo sapiens


Human beta-1,3-galactosyltransferase
homologue, ZNSSP8.


434 


33 


427




Homo sapiens 


unnamed protein product 


434 


33 


427 


gi14039836


Homo sapiens 


beta 1,3 N-
acetylglucosaminyltransferase Lc3
synthase mRNA, complete cds.


434 


33 


428 




Homo sapiens


Human proteasome subunit LMP7 
laueie uvlt iiikjna, complete cos. 


628 


49 


428 


gi38482 


Homo sapiens 


H.sapiens gene for major
histocompatibility complex encoded
proteasome subunit LMP7.


624 


49 


428 


gi1054747


Homo sapiens


H. sapiens DMA, DMB, HLA-Z1, IPP2,
LMP2, TAP1, LMP7, TAP2, DOB,
DQB2 and RING8, 9, 13 and 14 genes.


024 


49 


429 


AAG71415 


Homo sapiens


Human olfactory receptor polypeptide,
SEQ ID NO: 1096




100 


429 


AAG71594 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1275


1344 


83 


429 


AAG72476 


Homo sapiens 


Human OR-like polypeptide query 


1011 


100 


430 


gi10440063


Homo sapiens 


cDNA: FLJ23392 fis, clone HEP17418.


3045 


100 


430 


gi15214571


Mus musculus


Unknown (protein for
IMAGE:4207025)


zjyo 


OA 


430 


gi1770528


Homo sapiens 


H. sapiens mRNA for translin 
associated zinc finger protein-1.


687 


38 


431 


gi12859929


Mus musculus


putative 


917 


96 


431 


gi15207935


Macaca
fascicularis


hypothetical protein 


301 


96 


431 


gi1655637


Mus musculus


orf 


147 


27 


432 


gi4585414 


Bacteriophage 
933W 


hypothetical protein 


408 


42 


432 


gi4499798 


Bacteriophage 
933W 


orf15; homologous to ninG gene


408 


42 


432 


gi5881629 


Bacteriophage 
VT2-Sa 


hypothetical protein 


408 


42 


433 


gi13161184


Homo sapiens 


cytochrome P450 2S1 (CYP2S1) 
mRNA, complete cds. 


2615 


100 
433


AAB93056 


Homo sapiens 


Human protein sequence SEQ ID 
NO:11860.


2527 


100 


433 


gi14042396


Homo sapiens 


cDNA FLJ14699 fis, clone 
NT2RP2006571, moderately similar to 
CYTOCHROME P450 2G1 (EC 
1.14.14.1). 


2527 


100 


434 


gi13445575


Homo sapiens 


facultative glucose transporter 
GLUT10 (SLC2A10) mRNA, complete 
cds. 


2752 


99 


434 


gi13603727


Homo sapiens 


glucose transporter (GLUT10) mRNA, 
complete cds. 


2752 


99 


434 


gi11065680


Homo sapiens 


Novel human gene mapping to 
chromosome 20, similar to membrane 
transporters. 


2752 


99 


435 


gi13310486


Homo sapiens 


C2H2 zinc finger protein (SALL3) 
gene, complete cds. 


6094 


99 


435 


gi6688241 


Homo sapiens 


SALL3 gene, exons 1a, 2 and 3.


6070 


99 


435 


gi1296845


Mus musculus


spalt protein 


5089 


84 


436 


AAG71445 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1126. 


1312 


85 


436 


AAG71447 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1128. 


924 


61 


436 


gi15293797


Homo sapiens 


clone OR6M1 olfactory receptor gene, 
partial cds. 


829 


78 


437 


AAB65297 


Homo sapiens 


Human PRO9828 protein sequence
SEQ ID NO:511.


1360 


100 


437 


AAG89178 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
298. 


1360 


100 


437 


AAB84652 


Homo sapiens 


Amino acid sequence of fibroblast 
growth factor homologue zFGF12.


1360 


100 


438 


gi53756 


Mus musculus


minopontin precursor (AA -66 to 272)


1521 


100 


438 


gi297546 


Mus musculus


osteopontin 


1516 


99 


438 


gi50864 


Mus musculus


T lymphocyte activation protein 


1514 


99 






WO 02/081731 



PCT/US02/01222 



Table 3 



SEQ ID
NO:


Database 
entry ID 


Description 


*Results


1 


PF00204 


Zinc finger C-x8-C-x5-C-x3-H type 
(and similar). 


PF00204 11.59 9.700e-12 426-437


1 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 3.667e-09 33-42 


2 


BL00291 


Prion protein. 


BL00291A 4.49 8.759e-09 185-220


3 


PF01105 


emp24/gp25L/p24 family. 


PF01105B 25.12 1.000e-40 178-230 


4 


BL00307 


Legume lectins beta-chain proteins. 


BL00307G 9.91 8.531e-10 678-689


4 


PF00922 


Vesiculovirus phosphoprotein. 


PF00922A 19.17 8.862e-09 281-315 


6 


BL01159 


WW/rsp5/WWP domain proteins. 


BL01159 13.85 6.073e-09 61-76 


6 


BL00591 


Glycosyl hydrolases family 10 proteins. 


BL00591G 9.65 9.167e-09 311-323 


7 


BL01159 


WW/rsp5/WWP domain proteins. 


BL01159 13.85 6.073e-09 61-76 


7 


BL00591 


Glycosyl hydrolases family 10 proteins. 


BL00591G 9.65 9.167e-09 311-323 


9 


BL00913 


Iron-containing alcohol dehydrogenases 
proteins. 


BL00913D 24.20 8.981e-17 170-204
BL00913C 7.62 4.375e-11 136-146
BL00913B 10.94 7.706e-11 86-102


10 


BL00913 


Iron-containing alcohol dehydrogenases 
proteins. 


BL00913D 24.20 8.981e-17 218-252
BL00913C 7.62 4.375e-11 184-194
BL00913B 10.94 7.706e-11 134-150


11 


BL50062 


BCL2-like apoptosis inhibitors (spans 
part of BID, BH1 and BH. 


BL50062C 6.66 8.500e-11 349-358


14 


BL01144 


Ribosomal protein L31e proteins. 


BL01144 25.07 9.069e-26 78-130


15 


PF00204 


Zinc finger C-x8-C-x5-C-x3-H type 
(and similar). 


PF00204 11.59 6.694e-10 355-366 


15 


BL00904 


Protein prenyltransferases alpha subunit 
repeat proteins proteins. 


BL00904A 8.30 4.000e-09 485-535 


15 


BL00415 


Synapsins proteins. 


BL00415N 4.29 6.727e-12 483-527 
BL00415N 4.29 2.774e-09 1 18-600 
BL00415P 2.37 4.290e-09 819-855
BL00415Q 2.23 6.534e-09 474-510 


15 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 4.500e-14 490-505 
PR00049D 0.00 2.500e-12 489-504 
PR00049D 0.00 4.000e-12 491-506 
PR00049D 0.00 8.201e-11 488-503
PR00049D 0.00 1.205e-10 492-507 
PR00049D 0.00 3.746e-09 487-502 
PR00049D 0.00 5.271e-09 485-500 
PR00049D 0.00 6.644e-09 493-508 


15 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 9.022e-13 471-504 
DM00215 19.43 1.458e-09 483-516 
DM00215 19.43 2.678e-09 469-502 
DM00215 19.43 5.424e-09 468-501 
DM00215 19.43 8.017e-09 470-503 
DM00215 19.43 9.085e-09 466-499 
DM00215 19.43 9.237e-09 484-517 


15 


BL01113 


C1q domain proteins.


BL01113A 17.99 9.308e-09 116-143


15 


BL00048 


Protamine P1 proteins.


BL00048 6.39 5.263e-10 196-223
BL00048 6.39 3.363e-09 262-289
BL00048 6.39 9.112e-09 184-211


17 


PR00773 


GRPE PROTEIN SIGNATURE 


PR00773D 16.14 5.922e-09 215-235 
23


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 7.300e-26 600-203 
PD00930A 25.62 1.514e-16 497-523 


23 


BL50002 


Src homology 3 (SH3) domain proteins
profile. 


BL50002A 14.19 4.000e-12 727-746 


23 


PF00182 


GTPase-activator protein for Rho-like 
GTPases 


PF00182B 14.20 7.333e-12 549-128


25 


BL00375 


UDP-glycosyltransferases proteins. 


BL00375F 16.99 7.061e-35 291-336 
BL00375C 18.27 2.615e-19 126-150 
BL00375D 14.56 9.000e-17 192-220 
BL00375B 21.22 8.627e-16 67-108 
BL00375G 13.01 4.577e-13 390-430 


28 


BL01170 


Ribosomal protein L6e proteins. 


BL01170A 12.34 9.143e-40 139-175


28 


PD01457 


RIBOSOMAL PROTEIN 40S ZINC- 
FINGER METAL. 


PD01457A 16.51 9.845e-09 67-112 


29 


BL00359 


Ribosomal protein L11 proteins.


BL00359B 23.07 4.231e-24 56-97 
BL00359C 22.18 6.148e-22 111-145 
BL00359A 20.66 4.000e-21 20-56 


29 


BL01108 


Ribosomal protein L24 proteins. 


BL01108A 20.33 1.000e-08 40-73


30 


PR00983 


CYSTEINYL-TRNA SYNTHETASE 
SIGNATURE 


PR00983D 14.16 3.209e-23 270-292
PR00983C 11.27 3.415e-21 239-258 
PR00983A 11.10 1.878e-12 75-87 


30 


BL00178 


Aminoacyl-transfer RNA synthetases 
class-I proteins. 


BL00178B 7.11 2.286e-09 314-325 


31 


PR00718 


PHOSPHOLIPASE D SIGNATURE


PR00718E 8.61 1.000e-08 327-351


32 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 6.133e-10 49-58 


33 


PF00992 


Troponin. 


PF00992A 16.67 7.972e-10 10-45
PF00992A 16.67 5.145e-09 17-52
PF00992A 16.67 6.684e-09 56-91


34 


BL01019 


ADP-ribosylation factors family 
proteins. 


BL01019A 13.20 8.000e-11 68-108


34 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 4.938e-20 75-98 
PR00449A 13.20 1.900e-15 34-56 
PR00449E 13.50 6.870e-15 173-196 
PR00449B 14.34 1.360e-10 57-74
PR00449D 10.79 5.364e-09 137-151 


37 


PR00764 


COMPLEMENT C9 SIGNATURE 


PR00764F 16.89 7.783e-11 204-225


37 


DM01077 


SEX HORMONE-BINDING 
GLOBULIN. 


DM01077A 16.30 1.165e-10 43-90


37 


BL00279 


Membrane attack complex components 
/perforin proteins. 


BL00279E 37.11 9.163e-09 187-235


38 


PR00832 


PAXILLIN SIGNATURE 


PR00832B 9.87 6.284e-10 768-792 


38 


PR00806 


VINCULIN SIGNATURE 


PR00806A 6.63 9.260e-09 766-777 


38 


PR00049 


WILMS TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.661e-15 766-781 
PR00049D 0.00 3.250e-12 764-779 
PR00049D 0.00 7.277e-11 765-780
PR00049D 0.00 8.786e-10 763-778 
PR00049D 0.00 9.390e-09 762-777 


40 


BL00226 


Intermediate filaments Proteins. 


BL00226D 19.10 3.172e-34 397-444
BL00226B 23.86 5.929e-23 230-278
BL00226C 13.23 4.808e-21 296-327
BL00226A 12.77 5.065e-13 129-144 
BL00226B 23.86 6.400e-10 181-229 


41 


BL00243 


Integrins beta chain cysteine-rich 
domain proteins. 


BL00243I 31.77 2.014e-09 156-199
BL00243I 31.77 5.437e-09 159-202
BL00243I 31.77 5.690e-09 30-73


41 


BL01208


VWFC domain proteins. 


BL01208B 15.83 5.865e-09 184-199 


41 


BL00203


Vertebrate metallothioneins proteins. 


BL00203 13.94 3.670e-11 66-112
BL00203 13.94 4.659e-11 40-86
BL00203 13.94 7.429e-11 70-116
BL00203 13.94 9.505e-11 140-186
BL00203 13.94 2.723e-10 21-67
BL00203 13.94 2.723e-10 61-107
BL00203 13.94 3.147e-10 105-151
BL00203 13.94 4.064e-10 22-68
BL00203 13.94 5.213e-10 161-207
BL00203 13.94 6.457e-10 26-72
BL00203 13.94 7.032e-10 184-230
BL00203 13.94 7.223e-10 80-126
BL00203 13.94 9.043e-10 130-176
BL00203 13.94 1.735e-09 175-221
BL00203 13.94 3.020e-09 150-196
BL00203 13.94 3.204e-09 65-111
BL00203 13.94 3.296e-09 95-141
BL00203 13.94 3.663e-09 135-181
BL00203 13.94 5.041e-09 47-93
BL00203 13.94 5.041e-09 85-131
BL00203 13.94 5.500e-09 100-146
BL00203 13.94 5.867e-09 126-172
BL00203 13.94 5.959e-09 90-136
BL00203 13.94 6.694e-09 170-216
BL00203 13.94 6.878e-09 151-197
BL00203 13.94 6.969e-09 17-63
BL00203 13.94 7.337e-09 115-161
BL00203 13.94 7.429e-09 71-117
BL00203 13.94 7.704e-09 171-217
BL00203 13.94 8.531e-09 155-201
BL00203 13.94 8.714e-09 165-211
BL00203 13.94 9.265e-09 116-162


41 


BL00269 


Mammalian defensins proteins. 


BL00269C 16.52 9.289e-09 28-57 
BL00269C 16.52 9.289e-09 72-101


41 


PD02283 


PROTEIN SPORULATION REPEAT 
PRECU. 


PD02283C 17.54 5.050e-09 138-166 
PD02283C 17.54 5.175e-09 24-52 
PD02283C 17.54 5.175e-09 68-96 
PD02283C 17.54 6.738e-09 113-141
PD02283C 17.54 7.188e-09 163-191
PD02283C 17.54 7.750e-09 173-201
PD02283C 17.54 7.975e-09 128-156
PD02283C 17.54 8.650e-09 148-176
PD02283C 17.54 9.325e-09 118-146


41 


BL00799 


Granulins proteins.


BL00799D 12.41 7.661e-09 49-96 
BL00799G 9.41 1.000e-08 39-80 


43 


BL00291 


Prion protein. 


BL00291A 4.49 4.414e-09 47-82 


44 


PF00084 


Sushi domain proteins (SCR repeat 
proteins. 


PF00084B 9.45 7.188e-10 1549-1561 


44 


BL00142 


Neutral zinc metallopeptidases, zinc-
binding region proteins.


BL00142 8.38 2.286e-09 730-741




44 


PR00480 


ASTACIN FAMILY SIGNATURE 


PR00480B 15.41 3.314e-09 725-744


45 


BL00414 


Profilin proteins. 


BL00414D 15.59 9.182e-10 81-108 


48 


PR00837 


ALLERGEN V5/IPX-1 FAMILY 
SIGNATURE 


PR00837D 11.12 6.023e-09 22-36 


48 


BL01009 


Extracellular proteins SCP/Tpx- 
1/Ag5/PR-1/Sc7 proteins. 


BL01009E 13.50 8.204e-09 21-37 


49 


BL00284 


Serpins proteins. 


BL00284A 15.64 2.350e-20 85-109 
BL00284D 16.34 4.240e-19 323-350 
BL00284C 28.56 5.600e-17 216-258 
BL00284E 19.15 7.500e-14 408-433 
BL00284B 17.99 9.379e-13 189-210 


50 


BL01283 


T-box domain proteins.


BL01283A 24.15 2.125e-39 148-196 
BL01283B 23.17 9.438e-34 208-250 
BL01283D 11.70 7.868e-31 298-331 
BL01283C 13.05 8.448e-16 260-274 


50 


PR00937 


T-BOX DOMAIN SIGNATURE 


PR00937A 15.25 9.182e-26 156-181 
PR00937D 13.41 7.375e-17 259-274
PR00937B 14.58 8.615e-15 223-237
PR00937E 11.86 8.541e-14 301-315
PR00937F 12.53 1.450e-12 322-331
PR00937C 10.51 1.000e-11 240-250


50 


PR00938 


BRACHYURY PROTEIN FAMILY 
SIGNATURE 


PR00938C 8.28 6.547e-09 264-282


50 


PR00427 


INTERLEUKIN-8 RECEPTOR
SIGNATURE 


PR00427A 16.30 6.776e-09 416-431 


51 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270D 24.66 8.054e-09 50-86 


52 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 2.543e-13 181-221 


52 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 7.682e-11 150-172
PR00245C 7.84 5.286e-10 290-306 


52 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237C 15.69 3.700e-09 195-218 
PR00237G 19.63 8^35e-09 326-353 


53 


PR00050 


COLD SHOCK PROTEIN 
SIGNATURE 


PR00050A 11.28 3.143e-12 42-58 
PR00050C 9.82 9.151e-11 85-104


53 


BL00352 


'Cold-shock' DNA-binding domain 
proteins. 


BL00352B 23.66 2.881e-13 71-110
BL00352A 12.19 1.327e-10 42-57 


56 


BL01173 


Lipolytic enzymes G-D-X-G family,
histidine.


BL01173B 13.27 4.462e-17 140-167 
BL01173C 8.98 4.349e-14 182-196
BL01173A 9.41 1.818e-13 454-467
BL01173C 8.98 6.553e-13 495-509
BL01173A 9.41 8.364e-13 107-120 


57 


PR00321 


GAMMA G-PROTEIN 
(TRANSDUCIN) SIGNATURE 


PR00321C 15.39 2.473e-12 123-141 


58 


PR00937 


T-BOX DOMAIN SIGNATURE 


PR00937A 15.25 1.000e-24 117-142 
PR00937D 13.41 5.500e-18 220-235 
PR00937B 14.58 5.235e-13 184-198 
PR00937F 12.53 1.450e-12 293-302 
PR00937E 11.86 1.918e-12 259-273 
PR00937C 10.51 3.133e-11 201-211


58


BL01283 


T-box domain proteins. 


BL01283A 24.15 1.000e-40 109-157
BL01283B 23.17 9.156e-34 169-211
BL01283C 13.05 8.286e-17 221-235
BL01283D 11.70 5.709e-11 269-302


58 


PR00938 


BRACHYURY PROTEIN FAMILY
SIGNATURE


PR00938C 8.28 7.384e-09 225-243 


59 


PD02059 


CORE POLYPROTEIN PROTEIN 
GAG CONTAINS: P. 


PD02059A 28.10 2.694e-09 116-157 


63 


BL00196


Ribosomal protein L30 proteins. 


BL00196 34.38 3.250e-15 46-97


64 


BL00226 


Intermediate filaments proteins. 


BL00226B 23.86 1.205e-31 264-312


64 


BL01305 


moaA / nifB / pqqE family proteins. 


BL01305B 10.95 8.875e-09 78-88 


68 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 6.727e-13 33-67 


69 


PR00874 


FUNGI-IV METALLOTHIONEIN
SIGNATURE 


PR00874C 4.37 7.214e-10 68-83 


69 


PD00866 


GLYCOPROTEIN PROTEIN SPIKE 
E2 PRECURSOR PEPLOMER 


PD00866L 3.73 6.564e-10 1-11
PD00866L 3.73 1.443e-09 26-36


69 


BL00026 


Chitin recognition or binding domain 
proteins. 


BL00026 12.95 3.013e-09 48-69 


69 


DM01724 


kw ALLERGEN POLLEN CIM1 HOL- 
LL 


DM01724 8.14 3.250e-09 10-30 


69 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 6.838e-09 111-126 


69 


BL00243 


Integrins beta chain cysteine-rich 
domain proteins. 


BL00243I 31.77 4.838e-10 106-149
BL00243I 31.77 7.221e-10 18-61
BL00243I 31.77 1.761e-09 41-84
BL00243I 31.77 3.408e-09 31-74
BL00243I 31.77 7.465e-09 71-114


69 


BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 4.107e-13 66-112
BL00203 13.94 2.138e-12 92-138
BL00203 13.94 1.099e-11 28-74
BL00203 13.94 3.176e-11 82-128
BL00203 13.94 3.374e-11 87-133
BL00203 13.94 5.846e-11 77-123
BL00203 13.94 7.231e-11 102-148
BL00203 13.94 1.670e-10 97-143
BL00203 13.94 2.532e-10 103-149
BL00203 13.94 5.021e-10 88-134
BL00203 13.94 7.128e-10 38-84
BL00203 13.94 7.168e-10 107-153
BL00203 13.94 7.702e-10 73-119
BL00203 13.94 9.426e-10 25-71
BL00203 13.94 1.918e-09 101-147
BL00203 13.94 2.745e-09 27-73
BL00203 13.94 4.031e-09 71-117
BL00203 13.94 4.857e-09 36-82
BL00203 13.94 5.041e-09 98-144
BL00203 13.94 5.154e-09 6-52
BL00203 13.94 6.418e-09 76-122
BL00203 13.94 7.980e-09 91-137
BL00203 13.94 8.255e-09 13-59
BL00203 13.94 8.898e-09 48-94


69 


PR00876 


NEMATODE METALLOTHIONEIN
SIGNATURE 


PR00876B 7.66 9.514e-09 80-94 


73 


PR00875 


MOLLUSC METALLOTHIONEIN
SIGNATURE 


PR00875A 5.83 9.679e-10 17-29 
74


PR00185 


HISTONE H4 SIGNATURE 


PR00185B 13.68 8.888e-09 364-384 


86 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 7.000e-13 200-213 


86 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 6.850e-13 850-867
BL00028 16.07 1.900e-10 184-201
BL00028 16.07 6.100e-10 371-388
BL00028 16.07 6.914e-09 317-334


86 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048B 6.02 7.158e-09 197-207 


87 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870D 15.74 8.468e-09 358-393 


88 


BL00048 


Protamine P1 proteins.


BL00048 6.39 5.500e-10 70-97
BL00048 6.39 2.350e-09 62-89
BL00048 6.39 3.700e-09 60-87
BL00048 6.39 5.050e-09 63-90
BL00048 6.39 6.288e-09 61-88
BL00048 6.39 9.438e-09 71-98


89


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 8.920e-10 202-217 
PR00320B 12.19 9.486e-10 202-217 
PR00320C 13.01 7.900e-09 292-307 
PR00320A 16.74 8.902e-09 202-217 


90


BL00453 


FKBP-type peptidyl-prolyl cis-trans
isomerase proteins. 


BL00453B 23.86 3.864e-28 106-140 
BL00453A 15.57 1.000e-15 81-96 
BL00453C 9.72 1.000e-12 147-160


92 


PR00299 


ALPHA CRYSTALLIN SIGNATURE 


PR00299B 17.53 7.211e-09 324-337 


93 


PF00676 


Dehydrogenase E1 component


PF00676D 14.40 4.857e-13 421-441 
PF00676C 16.88 1.931e-10 389-413 
PF00676B 24.71 5.433e-10 192-230 


96 


BL00824 


Elongation factor 1 beta/beta'/delta
chain proteins. 


BL00824B 9.21 3.919e-09 1472-1492 


99 


PR00417 


PROKARYOTIC DNA 
TOPOISOMERASE I SIGNATURE 


PR00417A 12.66 5.415e-09 866-880 


102 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.936e-29 17-56 


102 


BL00028 


Zinc finger, C2H2 type, domain
proteins.


BL00028 16.07 2.059e-14 435-452
BL00028 16.07 7.353e-14 351-368
BL00028 16.07 2.350e-13 295-312
BL00028 16.07 9.100e-13 491-508
BL00028 16.07 2.174e-12 463-480
BL00028 16.07 8.826e-12 211-228
BL00028 16.07 2.038e-11 379-396
BL00028 16.07 2.385e-11 323-340
BL00028 16.07 3.423e-11 239-256
BL00028 16.07 9.654e-11 407-424
BL00028 16.07 1.000e-10 267-284


102 


BL00479 


Phorbol esters / diacylglycerol binding 
domain proteins. 


BL00479A 19.86 6.362e-09 366-389


102 


PD02462 


PROTEIN BOLA TRANSCRIPTION 
REGULATION AC. 


PD02462A 22.48 7.695e-09 204-239 


102 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 1.000e-15 460-474
PR00048A 10.52 1.000e-14 432-446
PR00048A 10.52 3.250e-14 320-334
PR00048A 10.52 4.750e-14 348-362
PR00048A 10.52 6.250e-14 376-390
PR00048A 10.52 3.133e-13 292-306
PR00048A 10.52 1.529e-12 488-502
PR00048B 6.02 1.000e-11 336-346
PR00048B 6.02 9.308e-11 224-234
PR00048B 6.02 2.688e-10 476-486 
PR00048B 6.02 3.250e-10 280-290 
PR00048A 10.52 5.696e-10 404-418 
PR00048A 10.52 6.087e-10 264-278 
PR00048B 6.02 6.187e-10 420-430 
PR00048A 10.52 7.214e-10 236-250 
PR00048B 6.02 8.875e-10 364-374 
PR00048B 6.02 3.368e-09 171-181 
PR00048B 6.02 4.316e-09 448-458 


103 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 9.438e-37 10-49 


103 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 5.500e-13 413-430
BL00028 16.07 1.000e-12 273-290
BL00028 16.07 1.783e-12 357-374
BL00028 16.07 7.577e-11 301-318
BL00028 16.07 9.308e-11 441-458
BL00028 16.07 9.308e-11 469-486
BL00028 16.07 1.300e-10 329-346


103 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.000e-14 354-368 
PR00048A 10.52 2.286e-13 298-312 
PR00048A 10.52 9.357e-13 270-284 
PR00048A 10.52 3.209e-12 410-424 
PR00048B 6.02 5.000e-12 286-296 
PR00048B 6.02 1.000e-11 342-352
PR00048B 6.02 1.000e-11 370-380
PR00048B 6.02 1.125e-10 314-324
PR00048A 10.52 2.565e-10 466-480
PR00048A 10.52 4.522e-10 438-452
PR00048B 6.02 1.474e-09 454-464
PR00048A 10.52 3.520e-09 326-340
PR00048B 6.02 4.789e-09 482-492 


103 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 8.200e-16 289-302
PD00066 13.92 3.769e-15 317-330
PD00066 13.92 6.538e-15 373-386
PD00066 13.92 2.800e-14 345-358
PD00066 13.92 4.600e-14 457-470
PD00066 13.92 4.130e-11 401-414
PD00066 13.92 9.654e-10 429-442
PD00066 13.92 5.200e-09 261-274


103 


BL01024 


Protein phosphatase 2A regulatory 
subunit PR55 proteins. 


BL01024H 13.88 7.353e-09 163-216 


104 


PD01781 


PROTEASE IMMUNOGLOBULIN 
PRECURSO. 


PD01781B 27.55 8.680e-09 325-369 


105 


PD01781 


PROTEASE IMMUNOGLOBULIN 
PRECURSO. 


PD01781B 27.55 8.680e-09 379-423


107 


PR00939 


C2HC-TYPE ZINC-FINGER 


PR00939B 13.27 3.209e-09 1302-1311 
108


PD00066


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 2.800e-14 279-292
PD00066 13.92 4.600e-14 307-320
PD00066 13.92 1.000e-13 335-348
PD00066 13.92 7.500e-13 363-376


108 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 7.882e-14 319-336
BL00028 16.07 7.300e-13 347-364
BL00028 16.07 4.913e-12 291-308
BL00028 16.07 2.500e-10 263-280
BL00028 16.07 1.257e-09 375-392


108 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 4.214e-13 288-302
PR00048B 6.02 5.000e-12 304-314
PR00048A 10.52 6.824e-12 372-386
PR00048A 10.52 7.353e-12 344-358
PR00048A 10.52 7.158e-11 316-330
PR00048B 6.02 7.231e-11 276-286
PR00048B 6.02 1.000e-09 332-342
PR00048B 6.02 6.211e-09 388-398


108 


BL00115


Eukaryotic RNA polymerase II
heptapeptide repeat proteins.


BL00115Z 3.12 8.842e-18 96-145
BL00115Z 3.12 7.144e-17 89-138
BL00115Z 3.12 6.888e-16 103-152
BL00115Z 3.12 7.791e-15 82-131
BL00115Z 3.12 3.947e-14 61-110
BL00115Z 3.12 7.292e-14 117-166
BL00115Z 3.12 9.164e-14 110-159
BL00115Z 3.12 1.000e-13 75-124
BL00115Z 3.12 3.871e-13 54-103
BL00115Z 3.12 6.819e-13 68-117
BL00115Z 3.12 4.168e-11 124-173
BL00115Z 3.12 9.651e-10 47-96
BL00115Z 3.12 7.485e-09 71-120
BL00115Z 3.12 9.669e-09 78-127


109 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 5.680e-33 391-420
PR00193C 12.60 4.789e-32 156-184
PR00193B 11.69 1.692e-26 110-136
PR00193E 19.47 5.500e-21 445-474
PR00193A 15.41 4.130e-20 54-74
PR00193E 19.47 5.091e-12 444-473


110 


BL00239 


Receptor tyrosine kinase class II 
proteins. 


BL00239B 25.15 2.985e-16 67-115 


110 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 8.660e-13 132-151


110 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 4.462e-25 132-163
BL00107B 13.31 6.143e-10 197-213 


110 


DM00406 


GLIADIN. 


DM00406 7.73 1.800e-09 818-831 


110 


BL00904 


Protein prenyltransferases alpha subunit 
repeat proteins proteins. 


BL00904A 8.30 5.596e-09 815-865 


110 


BL00415 


Synapsins proteins. 


BL00415A 6.15 7.684e-09 796-837 


110 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.373e-09 801-834 
DM00215 19.43 7.712e-09 797-830 
110


PR00209 


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 4.188e-09 817-836 
PR00209C 4.56 8.929e-09 790-804 


111 


BL00678 


Trp-Asp (WD) repeat proteins proteins.


BL00678 9.67 2.800e-10 366-377
BL00678 9.67 5.263e-09 417-428
BL00678 9.67 6.211e-09 186-197


111 


PR00308 


TYPE I ANTIFREEZE PROTEIN 
SIGNATURE 


PR00308C 3.83 8.892e-10 108-118
PR00308C 3.83 8.892e-10 109-119 
PR00308C 3.83 8.364e-09 107-117 


111 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320A 16.74 4.000e-13 364-379 
PR00320B 12.19 7.923e-12 415-430 
PR00320A 16.74 5.966e-11 415-430
PR00320C 13.01 7.214e-11 415-430
PR00320C 13.01 9.217e-11 364-379
PR00320A 16.74 9.690e-11 184-199
PR00320B 12.19 3.057e-10 184-199
PR00320C 13.01 6.040e-10 184-199
PR00320B 12.19 6.657e-10 364-379
PR00320B 12.19 1.450e-09 457-472
PR00320C 13.01 2.200e-09 240-255
PR00320A 16.74 4.732e-09 457-472 

rKUUJZUA 10./4 0.4ooe-0y 2o 1-2^0 

PR00320C 13.01 1.000e-08 281-296


112 


DM00547 


SHADOW GLOBAL. 


/r 2j.*» O 2. j 1 

DM00547C 17.30 7.000e-19 23-45 
DM00547D 11.60 2.750e-13 105-119


112 


BL00315 


Dehydrins proteins. 


BL00315A 9.35 4.246e-10 1301-1329 


112 


PF00426 


Outer Capsid protein VP4 
(Hemagglutinin).


PF00426S 15.67 6.438e-10 1271-1309 


112 


BL00039 


DEAD-box subfamily ATP-dependent 
helicases proteins. 


BL00039D 21.67 6.793e-10 368-414 


112 


PD02191 


I ATP-BINDING NUCLEOSIDE 
TRANSCR. 


PD02191A 1355 9.036e-10 107-122 


112 


BL00048


Protamine P1 proteins.


BL00048 6.39 1.900e-09 1257-1284
BL00048 6.39 5.050e-09 1258-1285 


112 


PF00774


Dihydropyridine sensitive L-type 
calcium channel (Beta subuni. 


PF00774A 16.47 7.130e-09 1280-1326 
PF00774A 16.47 7.730e-09 1276-1322 


112 


BL00115 


Eukaryotic RNA polymerase II
heptapeptide repeat proteins.


BL00115Z 3.12 3.302e-10 1261-1310 
BL00115Z 3.12 4.837e-10 1258-1307 
BL00115Z 3.12 7.767e-10 1251-1300 
BL00115Z 3.12 8.167e-10 1263-1312 
BL00115Z 3.12 8.884e-10 1260-1309 09 
1247-1296 BL001 15Z 3.12 2.985e-09 1240- 
1289 BL00115Z 3.12 5.632e-09 1265-1314 
BL00115Z 3.12 8.676e-09 1253-1302 
BL00115Z 3.12 9.471e-09 1268-1317 
BL00115Z 3.12 9.735e-09 1257-1306 


112 


PF00186 


Flocculin repeat proteins. 


PF00186I 9.10 5.290e-13 1279-1309
PF00186I 9.10 6.838e-12 1277-1307
PF00186I 9.10 2.957e-11 1282-1312
PF00186I 9.10 7.496e-11 1276-1306
PF00186I 9.10 5.200e-10 1268-1298
PF00186I 9.10 7.450e-10 1278-1308
PF00186I 9.10 7.450e-10 1280-1310
rruOlooI 9.10 4.543e-09 12oo-129o 
PF00186I9.10 5.252e-09 1285-1315 

"D 17A A 1 0£T A 1 A H A1 1 ^ AA 1 1 *J A« 

JfrOOlBol 9.10 6.03 le-09 1272-1302 
PF00186I9.10 6.102e-09 1274-1304 

DT7AA1 O/TT A tt\n *%*ldZa. AA IOTA 1 1AA 

rrUUlooiy.10 /.23oe-OiJ 1270-1300 

"D17AA1 QCf A 1 A O A1 Co. AA 1 1 OA1 

r^uuiooiy.io o.oioe-oy 1201-1291 

Pimm ft/CT O 1 A A /ilia aa 1 *>iO 1 noi 

rruu i<$oi y.10 y.433e-oy 1202-1292 
PF001861 9.10 9.433e-09 1267-1297 

ppftfiifl/n'Q in 1 nnn^_ns io*£_ii8£ 
rruuiooi7.iu i.uuue-uo izoo-izoo 


114 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 8.788e-ll 237-256 


114 

X X"T 


SjJlWKJJOJ 


Tyrosine specific protein phosphatases
proteins.


DT AA1ft*il7 1 A K C OITa 1 A IvIA OC1 

131JJU3o3xl 10.33 3.32 /e-10 240-231 


116 


PR00884


XvJLD w O WlVi/VLf rSSXJ 1 HUN UDO 

SIGNATURE 


"DT> AAftQvIT? ft *J1 Vl *7CA** AD >MO y| j^iC 

rK00oo4c 5.32 4. / jOe-Oy 44V-4oo 


117 


PD02890 


ISOMERASE CHALCONE-FLAVONONE
FLAV.


pnn9Ronp 1£ ia & 4<7<*_aq iaa 


118 


BL00226 


Intermediate filaments proteins. 


BL00226B 23.86 6.513e-10 401-449 


118 


BL00326 


Tropomyosins proteins. 


BL00326D 8.76 1.925e-09 196-237 


118 


BL01160 


Kinesin light chain repeat proteins.


BL01160B 19.54 2.678e-09 328-382

PT fil 1 £AP 10 *ZA ft QOO^a AO AZA *7AO 
J3LU110UD 12J.34 O.Vj^e-Uy 0j4-/Uo 


119 


PD01823 


PROTEIN INTERGENIC REGION
ABC1 PRECURSOR
MITOCHONDRION T.


PD01823C 16.13 7.000e-14 352-373 

PHA1 R9^P *\A QfL 1 IQOt* 11 lift 1/18 ' 

PD01823D 16.66 6.857e-10 430-451 


119 


PD01115 



PR KPT TR ^OP AMPHTOT AN WTN 

x xvL>v/ u j\o wiv xVLVxx xxxX> JLtyxn Ox\JxN 

SIGNAL. 


PTVI1 1 1 10 OO ft >nia AA *%iZQ oo*> 

rl/Ul i 1015 LZ.yZ o.42ie-Uy 20o-2o2 


122 


BL00854 


Proteasome B-type subunits proteins. 


BL00854C 29.92 8.435e-19 114-143 


124 


BL00651 


Ribosomal protein L9 proteins.


DT AAiCCI A 11 K ft AnHa. 1*7 Q>! 11)1 


125 




RIO1/ZK632.3/MJ0444 family

proteins. 


PT Ai *>/i^i? 1 ft *7< o ii a mi 
t>LX)iZHjr lo./j 2.373e-23 334-371 

BL01245A 14.04 8.342e-23 206-231 
BL01245C 13.31 6.564e-15 262-282

1D.ZB l.UUUc-lZ JzU-JjU 

BL01245B 11.91 9.809e-10 245-255 


128 


PR00793 


PROLYL AMINOPEPTIDASE (S33) 
FAMILY SIGNATURE 


PR00793C 12.24 1.333e-09 168-183 


128 


PR00111 


ALPHA/BETA HYDROLASE FOLD 
SIGNATURE 


PR00111C 13.46 6.000e-09 182-196


129 


BL01160 


Kinesin light chain repeat proteins. 


BL01160D 10.17 7.077e-09 505-534 


129 


PD00126 


PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 


PD00126A 22.53 1.000e-08 695-716


130 


BL00355 


HMG14 and HMG17 proteins. 


BL00355 5.97 8.412e-32 18-49


130 


PR00925 


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925B 3.73 3.400e-16 34-47
PR00925A 5.47 1.750e-15 18-33
PR00925C 5.57 9.824e-09 51-62


131 


PR00041 


CAMP RESPONSE ELEMENT
BINDING (CREB) PROTEIN
SIGNATURE


PR00041B 7^0 2.976e-13 305-326




131 


BL00036 


bZIP transcription factors basic domain 
proteins. 


BL00036 9.02 4.103e-09 299-312 


132 


PR00211


GLUTELIN SIGNATURE 


PR00211B 0.86 1.750e-09 205-226 
PR00211B 0.86 8.750e-09 199-220 


132 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 4.529e-11 201-234

LJiyLKfKJ^LJ L7.HJ H.UJHc-lU ZUZ - ZjD 

DM00215 19.43 6J04e-10 207-240 

Dlvf 0021 5 1 0 4/* 7 4?Qf»-1 fl 1 RAJ? 1 

DM00215 19.43 8.393e-10 196-229 
DM00215 19.43 8.714e-10 218-251 
DM00215 19.43 6.034e-09 185-218
DM00215 19.43 6.034e-09 219-252 
DM00215 19.43 6.492e-09 223-256 
DM00215 19.43 7.254e-09 200-233 
DM00215 19.43 9.390e-09 189-222 
DM00215 19.43 9.695e-09 213-246 


133 


BL00455 


Putative AMP-binding domain proteins.


BL00455 13.31 5.125e-11 293-309


133 


PR00154 


AMP-BINDING SIGNATURE 


PR00154A 8.88 6.276e-09 286-298 


136 


PD00015 


GLYCOPROTEIN PRECURSOR 
CELL SI. 


PD00015A 8.90 6.400e-09 243-251 


138 


BL00227 


Tubulin subunits alpha, beta, and 
gamma proteins. 


BL00227B 19.29 1.000e-40 52-107
BL00227C 25.48 1.000e-40 113-165
BL00227A 24.55 8.200e-36 1-35 


140 


PR00049 


WILMS TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.377e-13 60-75
PR00049D 0.00 7.500e-10 63-78
PR00049D 0.00 8.071e-10 61-76


140 


PR00806 


VINCULIN SIGNATURE


PR00806B 4.28 8.440e-09 68-82


140 


BL00904 


Protein prenyltransferases alpha subunit 
repeat proteins proteins. 


BL00904A 8.30 9.553e-09 60-110


141 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 6.438e-12 1175-1190


141 


BL01187 


Calcium-binding EGF-like domain 
proteins pattern proteins.


BL01187B 12.04 5.800e-11 1284-1300
BL01187B 12.04 8.200e-11 180-196


141 


BL01248 


Laminin-type EGF-like (LE) domain
proteins. 


BL01248 11.02 4.343e-12 1362-1375 
BL01248 11.02 2.350e-11 322-335
BL01248 11.02 4.125e-10 271-284


141 


PR00764 


COMPLEMENT C9 SIGNATURE 


PR00764B 13.56 3.475e-09 1047-1068 


141 


PR00010 


TYPE II EGF-LIKE SIGNATURE


PR00010C 11.16 4.205e-09 185-196


141 


BL01113 


C1q domain proteins.


BL01113A 17.99 5.673e-09 1621-1210 


141 


PR00011 


TYPE III EGF-LIKE SIGNATURE


PR00011D 14.03 8.895e-12 551-132 
PR00011B 13.08 5.846e-11 551-132
PR00011D 14.03 3.215e-10 313-332 
PR00011A 14.06 4.214e-10 313-332 
PR00011B 13.08 7.783e-10 313-332
PR00011A 14.06 7.781e-09 551-132 


141 


BL00420 


Speract receptor repeat proteins domain
proteins.


BL00420A 20.42 8.200e-09 1186-1215




141 


PD02510 


ISOMERASE GALACTOSE-6-
PHOSPHATE


PD02510B 18.31 8.170e-09 548-144


141 


PR00261 


LOW DENSITY LIPOPROTEIN 
(LDL) RECEPTOR SIGNATURE 


PR00261F 11.57 9.544e-09 1052-1074 


141 


PR00288 


PUROTHIONIN SIGNATURE 


PR00288C 10.15 9.165e-09 311-326 


142 


DM01970 


0 kw ZK632.12 YDR313C 

ENDOSOMAL III.


DM01970B 8.60 4.750e-17 114-565


142 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 2.373e-09 203-257 



142 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 4.000e-09 559-130


142 


BL00422 


Granins proteins. 


BL00422E 26.86 8.615e-09 462-498 


143 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 5.846e-15 141-154
PD00066 13.92 9.217e-11 551-564
PD00066 13.92 6.700e-09 523-536


143 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 9.526e-11 122-136
PR00048A 10.52 2.174e-10 532-546 
PR00048A 10.52 6.087e-10 588-164 
PR00048B 6.02 7.632e-09 138-148 
PR00048A 10.52 8.920e-09 504-518 


143 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


PF00651 15.00 8.920e-09 59-72 


143 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 7.577e-11 535-114
BL00028 16.07 2.200e-10 125-142
BL00028 16.07 5.800e-10 507-524
BL00028 16.07 8.714e-09 591-1 /U
BL00028 16.07 9.743e-09 444-461


144 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 3.672e-10 262-285 


144


BL00215


Mitochondrial energy transfer proteins. 


BL00215A 15.82 7.900e-15 16-41 
BL00215A 15.82 8.147e-14 260-285 

BL00215A 15.82 1.804e-09 166-191
BL00215B 10.44 5.500e-09 114-127


144


PR00927


ADENINE NUCLEOTIDE

TRANSLOCATOR 1 SIGNATURE 


PR00927B 14.66 8.644e-09 104-126


147


DM01417


MUSHROOM SPAC22G7.04. 


DM01417C 12.93 3.250e-11 267-279
DM01417D 11.08 2.200e-10 306-322 


148 


BL01160 


Kinesin light chain repeat proteins.


DLUl lOUD ly.Jt O.O /OC-1U j*t7-*JV/j 


151 


PR00049 


WILMS TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.807e-11 419-434
PR00049D 0.00 8.125e-11 1284-1299
PR00049D 0.00 3.929e-10 1283-1298 
PR00049D 0.00 3.288e-09 417-432 


151 


BL00904 


Protein prenyltransferases alpha subunit 
repeat proteins proteins. 


BL00904A 8.30 3.553e-09 416-466 


154 


BL00665 


Dihydrodipicolinate synthetase 
proteins. 


BL00665D 14.76 1.000e-11 109-132
BL00665C 25.58 5.832e-11 50-101


154 


PR00146 


DIHYDRODIPICOLINATE 
SYNTHASE SIGNATURE 


PR00146D 16.26 2.525e-10 108-126 
PR00146A 12.62 8.615e-09 13-35 


156 


PD02906 


SYNTHASE I PSEUDOURIDYLATE 
PSEUDOURIDINE LYASE TR. 


PD02906C 24.17 9.115e-15 171-206
PD02906B 15.35 4.886e-13 142-155
PD02906D 12.27 1.000e-09 239-249
PD02906A 10.84 8.333e-09 92-105 


157 


BL00107


Protein kinases ATP-binding region 
proteins. 


BL00107B 13.31 2.286e-11 396-412
BL00107A 18.39 6.148e-11 332-363


157 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 4.938e-09 332-351 


160 


PF01008 


Initiation factor 2 subunit 


PF01008B 25.59 9.171e-36 366-409 
PF01008A 20.14 8.676e-12 315-336 
PF01008C 12.25 7.382e-10 449-469 


161 


BL00591 


Glycosyl hydrolases family 10 proteins. 


BL00591D 8.33 6.167e-09 2099-2112 


163 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.120e-09 99-113 
PR00019B 11.36 7.840e-09 73-87 


164 


BL00198 


Nt-dnaJ domain proteins. 


BL00198A 8.07 3.000e-14 143-160


164 


PR00187 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00187A 12.84 8.800e-12 139-159 


165 


PR00310 


ANTI-PROLIFERATIVE PROTEIN 
BTG1 FAMILY SIGNATURE 


PR00310B 10.59 4.000e-39 41-71 
PR00310C 12.74 2.256e-33 71-101 
PR00310D 9.10 9.820e-33 101-131
PR00310A 11.17 7.000e-27 16-41


165 


BL00960 


BTG1 family proteins.


BL00960B 24.47 1.000e-40 34-79 
BL00960C 12.68 6.745e-21 98-120 
BL00960A 10.98 5.304e-12 14-26 


166 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 2.688e-21 124-174 


166 


DM00973 


3 kw RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE.


DM00973A 21.17 4.162e-10 96-133 


166 


PR00171 


SUGAR TRANSPORTER 
SIGNATURE 


PR00171D 12.76 3.520e-13 456-478 
PR00171E 14.87 2.750e-09 479-492 


166 


PR00172 


GLUCOSE TRANSPORTER 
SIGNATURE 


PR00172D 9.13 6.513e-09 456-480


167


BL00216


Sugar transport proteins.


BL00216B 27.64 5.198e-20 124-174


167 


DM00973 


3 kw RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE.


DM00973A 21.17 4.162e-10 96-133 


168 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 5.929e-32 59-98 


168 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 2.385e-15 520-533
PD00066 13.92 2.800e-14 296-309
PD00066 13.92 5.200e-14 240-253
PD00066 13.92 5.200e-14 548-561
PD00066 13.92 9.400e-14 436-449
PD00066 13.92 1.000e-13 324-337
PD00066 13.92 6.143e-12 352-365
PD00066 13.92 6.885e-10 268-281


168 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048B 6.02 6.000e-12 237-247 
PR00048A 10.52 6.294e-12 333-347 
PR00048A 10.52 6.824e-12 361-375 
PR00048A 10.52 9.471e-12 249-263 
PR00048A 10.52 4.316e-11 119-133
PR00048A 10.52 4.789e-11 529-543
PR00048A 10.52 6.684e-11 445-459
PR00048A 10.52 8.141e-11 305-319
PR00048B 6.02 6.063e-10 321-331
PR00048B 6.02 6.063e-10 517-527

PR00048B 6.02 7.750e-10 545-117 

PR00048A 10.52 2.800e-09 389-403
PR00048A 10.52 1.000e-08 417-431 


170 


PR00456 


RIBOSOMAL PROTEIN P2
SIGNATURE


PR00456F 3.06 2.820e-11 6-21
PR00456F 3.06 7.125e-10 3-18


170 


PD02331 


CYCLIN CELL CYCLE DIVISION
PROTE. 


PD02331A 19.76 7.429e-15 93-140 
PD02331B 13.43 1.125e-09 174-207 


170 


PR00833 


POLLEN ALLERGEN POA PI 
SIGNATURE


PR00833H 2.30 5.269e-09 3-18 


171 


PD00126 


PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 


PD00126A 22.53 4.706e-14 140-161 
PD00126A 22.53 6.824e-14 289-310 


173 


BL00741 


Guanine-nucleotide dissociation 
stimulators CDC24 family sign.


BL00741B 14.27 3.418e-11 294-317


173 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 5.154e-11 86-102


173


PR00497


NEUTROPHIL CYTOSOL FACTOR
P40 SIGNATURE 


PR00497D 11.91 5.962e-10 91-113


173 


PF00564 


Octicosapeptide repeat proteins. 


PF00564B 24.74 6.442e-09 277-328 


175 


BL01016 


Glycoprotease family proteins. 


BL01016C 22.84 5.292e-19 60-105
BL01016H 13.71 6.157e-12 307-317
BL01016E 14.88 3.182e-11 141-169
BL01016D 8.86 6.741e-09 118-131 


175 


PR00789 


O-SIALOGLYCOPROTEIN
ENDOPEPTIDASE (M22)
METALLO-PROTEASE FAMILY
SIGNATURE


PR00789E 12.42 7.128e-14 141-163 
rK007o9C lo.l 1 2.707e-12 85-105 
PR00789B 10.48 1.205e-09 64-85


176 


PR00850 


GLYCOSYL HYDROLASE FAMILY 
59 SIGNATURE 


PR00850B 6.67 5.455e-09 148-173 


178 


PR00259 


TRANSMEMBRANE FOUR FAMILY 

SIGNATURE


PR00259A 9.27 8.676e-20 17-41
PR00259C 10.40 4.750e-17 85-114
PR00259B 14.81 8.615e-12 58-85
PR00259D 13.50 2.528e-11 235-262


178 


BL00421 


Transmembrane 4 family proteins. 


BL00421B 17.62 6.186e-17 64-103 
BL00421A 11.79 6.800e-12 13-32 
BL00421E 20.97 1.514e-10 232-262 
BL00421C 12 89 3 600e-09 147-1 SQ 


178 


PR00235 


HERPESVIRUS MAJOR CAPSID 
PROTEIN (MCP) SIGNATURE 


PR00235A 14.64 8.000e-09 87-111


179 


BL01052 


Calponin family repeat proteins. 


BL01052C 18.51 6.806e-40 87-127
BL01052A 16.12 7.180e-32 3-35
BL01052B 15.31 8.031e-26 52-78
BL01052D 10.26 1.000e-24 174-194


179 


PR00890 


SMOOTH MUSCLE PROTEIN 22-
ALPHA (TRANSGELIN)
SIGNATURE 


PR00890E 14.34 3.813e-21 135-155
PR00890A 8.61 9.775e-21 34-54
PR00890C 8.22 1.000e-17 84-98
PR00890B 8.75 3.455e-17 62-78
PR00890F 12.92 4.064e-14 161-174
PR00890D 16.17 5.174e-13 118-128


PR00888


SMOOTH MUSCLE
PROTEIN/CALPONIN FAMILY
SIGNATURE


JfKOUSosJi 9.97 5.154e-20 175-191 
PR00888C 12.27 5.179e-18 52-68 
PR00888D 16.09 4.273e-17 88-105 
PR00888A 11.87 2.350e-16 3-18 PR00888E 

1 1 Q1 'J Alt** 1iC 1A/I 1 OA Dt) AAOOOD n A A 

ll.ol J.43ze-lo 104-120 PROOooor 7.44 
4.oz3e-14 IZd-140 rKOOooou 12.73 o.759e- 
14 162-176 PR00888B 13.72 2.350e-12 22- 
36 


179 


PR00889 


CALPONIN SIGNATURE 


PR00889E 12.18 2.726e-12 171-187 


180


BL00875


Bacterial type II secretion system 
protein D proteins. 


BL00875A 25.57 6.447e-09 367-399 


181




rKU I bJUN KEr HA 1 
NEUROFILAMENT TRIPL. 


PD01351B 13.72 5.355e-09 238-264






KW J KAN oCKIr 1 AbE KE VERSE II 
ORF2. 


DM01354H 18.00 8.826e-27 109-149 
DM01354G 11.57 2.143e-25 78-109

DM01354F 14.56 1.414e-15 42-78 
DM01354E 18.69 8.650e-14 17-47 


182 




Renal dipeptidase proteins.


BL00869D 14.02 3.477e-09 67-96 


185 


BL00039 


DEAD-box subfamily ATP-dependent
helicases proteins. 


BL00039A 18.44 4.000e-25 222-261 
BL00039D 21.67 4.529e-23 498-544 
BL00039C 15.63 4.300e-16 347-371 

BL00039B 19.19 9.379e-15 262-288 


185 


PD00302 


PROTEASE POLYPROTEIN 

AA A lyi\UL(/itJlj /VOJT. 


PD00302B 9.52 1.346e-09 234-250 


186 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 5.714e-12 152-165 PD00066 
li.yz o.i4Je-i2 124-137 


186 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 6.885e-11 136-153
BL00028 16.07 2.200e-10 197-214


186 


PR00239


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE


PR00239E 1.58 5.705e-09 420-432


186 


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 2.957e-10 133-147
PR00048A 10.52 3.739e-10 194-208 

nn AAA /OA 1 A ^1 o rt^l. ia f/i « «f 

JrK00U4oA 10.52 o.043e-10 161-175 
PR00048B 6.02 8.105e-09 121-131 


187 


BL01022


PTR2 family proton/oligopeptide
symporters proteins.


BL01022B 22.19 4.240e-10 308-354


187 




INHIBIN ALPHA CHAIN
SIGNATURE


PR00669B 8.27 7.915e-09 264-281


190 


PR00830 


ENDOPEPTIDASE LA (LON) 
SERINE PROTEASE (S16) 
SIGNATURE 


PR00830A 8.41 3.342e-09 881-901


191 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.234e-13 261-280


191 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 1.000e-23 261-292 
BL00107B 13.31 1.000e-12 341-357 


191 


BL00239 


Receptor tyrosine kinase class II 
proteins. 


BL00239B 25.15 6.523e-10 196-244 


191 


BL00479 


Phorbol esters / diacylglycerol binding
domain proteins 


BL00479C 12.01 1.000e-09 320-333


191 


PR00834 


HTRA/DEGQ PROTEASE FAMILY
SIGNATURE


PR00834F 10.91 2^46e-09 786-799




193 


BL01033 


Globins profile. 


BL01033A 16.94 2.385e~18 25-47 


1Q1 
i y o 




SIGNATURE 


PPfifiRIAA 19 QA 1 (\(\(\r* 99 lft_A7 ! 

PR00814B 9.18 7.750e-18 48-64 


193




AyTVnfrT ORrM ^T/WATTTPF 


PPflAI 7^R O (Y) 0 1Q9a 1 (\ 9*-AO 


1QA 




G-PROTEIN BETA WD-40 REPEAT
SIGNATURE


PPfWY*9fiD 19 10£99£*k 11 1AA_1^< 
JrivUUjZUo 1Z.1? O.ZZoe-1.1 i4U-lDD 

ppnn^9fiA i^7AAQ7i<»iniAn is* 
PR00320C 13.01 9.280e-10 140-155 






Trp-Asp (WD) repeat proteins proteins.


RT iin^7» 0 A7 7 A79fvJiQ 1 A9 1 K*X 


196 


PR00832 


PAXILLIN SIGNATURE


PR00832B 9.87 9.174e-10 309-333 


196 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 2.054e-10 376-430
BL01160B 19.54 6.919e-10 383-437 
oLUIIoUd iy.54y.O/oe-IU3oy-423 


196 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE 


PR00049D 0.00 8.780e-09 40-55 


196 


BL00087 


Copper/Zinc superoxide dismutase 
proteins. 


BL00087C 20.18 8.784e-09 260-296 


196 


PR00806


VINCULIN SIGNATURE 


PR00806A 6.63 9.014e-09 308-319 


196


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 9.143e-09 506-540


197 


PR00674 


LIGHT HARVESTING PROTEIN B 

CHAIN SIGNATURE 


PR00674A 20.10 7.391e-09 134-155 


198 


PR00192 


"17 A / V 1 ' 1 kT /""* A TIT1TX T/*~* T»T1 /^TTJTVT TiTTT A 

F-ACTIN CAPPING PROTEIN BETA
SUBUNIT SIGNATURE 


PR00192C 6.65 2.500e-36 57-84 PR00192D 
8.23 4.462e-36 97-125 PR00192E8.85 

T AAAa Ol Oil aoa DDAAIO/1 A O t AHA* 

/.UOUeoi 212*23> PKUOiyZA o.23 1.474e- 
z / 3-2o rKUllly/c O.ZU 3.UUUe-ZO 20-4o 


198


BL00231


F-actin capping protein beta subunit
proteins. 


X>Laj\)/.j lAo.jy L.Wve-Hi) DO 1 X5LUU23 1 U 

14.16 1.000e-40 84-128 BL00231D 15.40 
1.000e-40 165-200 BL00231E 11.66 l.OOOe- 

AT\ 9 fiQ-9A£ m (\f\0^ 1 r* 1 9 77 1 1 8fl*» 1^1 AA- 

157 


199 


PF00023


Ank repeat proteins.


rruu^<}A L\J»\JD *t«/«>i/e~iu *fj-oi 


199 


PF00791 


Domain present in ZO-1 and Unc5-like
netrin receptors.


PF00791B 28.49 8.768e-12 87-142 

PFfin7Q1R 951 AO 7 09R*»_nO AOO 1 1 £ 


199 


BL01160 


Kinesin light chain repeat proteins. 


BL01160E 8.74 7.398e-09 323-362


201


PR00239


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE


PP HfYJIOP 1 & 1 1 Aa_AO 1 1 

rivuuzji/ii i.3o o.ii*ie-uy iw-iyo 


202 


BL00412 


Neuromodulin (GAP-43) proteins.


BL00412D 16.54 4.033e-10 319-370


202 


BL00224 


Clathrin light chain proteins. 


BL00224B 16.94 4.845e-09 313-366 


202 


PF00992 


Troponin. 


PF00992A 16.67 8.734e-12 333-368 
PF00992A 16.67 2.776e-09 344-379 
PF00992A 16.67 5.026e-09 351-386 


203 


BL00790 


Receptor tyrosine kinase class V 
proteins. 


BL00790R 16.20 7.677e-09 29-73


204 


BL00790 


Receptor tyrosine kinase class V 
proteins. 


BL00790R 16.20 7.677e-09 29-73 


205 


BL00790 


Receptor tyrosine kinase class V 
proteins. 


BL00790R 16.20 7.677e-09 29-73 


207 


BL00211 


ABC transporters family proteins.


BL00211B 13.37 3.077e-17 573-167 
BL00211B 13.37 7.577e-17 1204-1674
BL00211A 12.23 1.900e-09 472-484


207 


PR00478 


PHOSPHORIBULOKINASE FAMILY
SIGNATURE 


PRflfl47R A 1 "\ AA A 1 TIp-OQ Aid-AG? 


207


PR 00807 


cpTRT TX4 AT RI lA/TTM FAMTT V 

SIGNATURE 


PRH0R07r} 1A 57 7 T8«^_n0 071 OOA 


207




COM A TDTP OPTNJ ITnRlWffWF 

FAMILY SIGNATURE 


ris\j\}oo\jiJ Lj.yjj /.izje-uy ijU4-iDiy 


70Q 


PR00049


WILM'S TUMOUR PROTEIN

SIGNATURE 




210 


BL00972 


Ubiquitin carboxyl-terminal hydrolases
family 2 proteins. 


BL00972D 22.55 3.348e-11 388-413

oLUUy /2JtS 2U. /2 4.J4Je-Uy 415-437 


210 


PR00198 


ANNEXIN TYPE II SIGNATURE


PR00198H 12.05 7.750e-09 682-696 


214 


PD00469 


PROTEIN PRECURSOR SIGNAL 
HYDROLA.


PD00469A 13.95 6.400e-09 73-86 


215 


PF00023 


Ank repeat proteins.


PF00023A 16.03 8.875e-10 839-855 
PF00023A 16.03 2.286e-09 884-900 


215 


PR00342 


RHESUS BLOOD GROUP PROTEIN 
SIGNATURE 


PR00342H 7.61 9.703e-09 317-340 


21 1 


BL00982


Bacterial-type phytoene dehydrogenase 
proteins. 


BL00982A 18.41 8.013e-12 328-360


217 


PR00368 


FAD-DEPENDENT PYRIDINE 
NUCLEOTIDE REDUCTASE 

SIGNATURE


PR00368C 15.74 8.962e-11 326-352


217 


PR00469 


PYRIDINE NUCLEOTIDE
DISULPHIDE REDUCTASE CLASS-
II SIGNATURE


PR00469I 13.83 7.532e-11 449-468
PKUU4o>r lo.51 7. 152e-0y 322-347 


217




iivvJIN -D U l-»r UK lilJivv 1 JtvAJiN 

TRANSPORT AROMATIC 

HYTiROPARR 


rDUZU42H lo./D D.O/3e-Uy 120-141 

PD02042A 21.13 9.045e-09 93-120 


217 


PR00419 


ADRENODOXIN REDUCTASE
FAMILY SIGNATURE


PR00419A 14.89 9.486e-09 326-349 

rKUU4 1 yu 1U.OZ 7.3 34e-U^ 3 Z / -34Z 


218 


PF00157 


PDZ domain proteins (Also known as 
DHR or GLGF).


PF00157 13.40 4.600e-09 688-699 


219 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 7.000e-23 65-96 
BL00107B 13.31 4^14e-10 130-146 


219 


PR00109 


TYROSINE KINASE CATALYTIC 

DOMAIN SIGNATURE 


PR00109B 12.27 7.102e-10 65-84


219 




Receptor tyrosine kinase class III
proteins. 


"RT (\(Y)AflV 11 KUK H7Qa_AQ ^1_CQ 
X>LvUIFZ4UJD 11.30 O.UZUe-UJJ D1-07 


220 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 3.045e-09 38-50 


220 


DM01803 


1 HERPESVIRUS GLYCOPROTEIN 
H. 


DM01803A 10.51 9.349e-09 34-55 


220 


PR00049 


WILMS TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.160e-11 40-55
PR00049D 0.00 7.807e-11 41-56
PR00049D 0.00 8.336e-11 38-53
PR00049D 0.00 2^86e-10 42-57
PR00049D 0.00 8.857e-10 33-48
PR00049D 0.00 2.983e-09 37-52
PR00049D 0.00 9.847e-09 43-58


222 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 5.337e-10 825-859


222


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 9.924e-09 516-132


224 


BL00478 


LIM domain proteins.


RTAAA78R 1A 7Q 8 *\97a-AQ 1A1-1S8 


226 


BL00048 


Protamine P1 proteins.


BL00048 6.39 6.063e-09 199-226




BL00115


Eukaryotic RNA polymerase II


PT AA1 1 ^7 119^ TAAt* 1 A 111 1 A9 






heptapeptide repeat proteins.


PTAA1 1^7 1 19 1 A4Qp AQ 19A 1£Q 


aUjLO 




Synapsins proteins.


DT AA41 <n 9 97 R 791<»_A0 9^1 9C0 


29Q 




Glucosamine/galactosamine-6-
phosphate isomerases proteins.


UT A1 1 A1 A 1 0 A7 1 AAAo^AA 17 79


RT A1 1 Air* 9fi 14 1 AAAo_4A lOO 9AA 








nrnn^iR9i 17 ^ ao1a_iq ii7.i#yA 








RT.A1 1 A1P 18 d7 1 ^AAa-91 17A-100 


231 


PR00269 


PLEIOTROPHIN/MIDKINE FAMILY


PR0026QA 1^ 01 ^ T^^ft-^A 88-1 1 4 ^ 






SIGNATURE






m A0181 


x i iN/ivuv ucp<uXQHwiiiuiii^ protein 


HT AA1 ftl A 10 A*7 A OAA*»_19 9£_1 1 9 






laiiiiiy proteins. 


HT-AA1 CIA 1 0 .A9 O 99/l« 1 0 9Q 1 1 A 


236 


BL00888


Cyclic nucleotide-binding domain


TIT AAfiCCn 1/1 TO O n<Oo 11 /IOQ <71 






proteins.




236 




Synapsins proteins.


PT AAAl ^ISJ A 90 9 77A*»-AO 711 777 




phaaiaa" 


ri\,u i c Jin vjjL/ 1 v^L^r ivw i hijn 


PnAAlA^A 1A 9/C 1 1 11a_AO <A< XAfl 






PPT^TTTP^OP PJ7 




91A 


PR00209


ALPHA/BETA GLIADIN FAMILY


rKUUzOyo 4.oo 3.ol3e-U9 739-75o 






SIGNATURE




91A 


DM00668


7TDTNT 


DM00668A 10^0 8.500e-09 258-273 




BL01188


GNS1/SUR4 family proteins. 


BL01188B 13.46 4.115e-26 120-151 








BL01188C 22.65 4.136e-26 151-202 








BL01188D 8.62 1.290e-ll 238-255 








BL01188A 18.82 6.718e-10 55-87


*>3A 


PR00929 


AT-HOOK-LIKE DOMAIN 


PR00929B 4.38 8.875e-09 133-583 






SIGNATURE 


PR00929C 5.26 8.914e-09 133-144


242 


BL00232 


Cadherins extracellular repeat proteins


BL00232B 32.79 2.765e-25 541-151 






domain proteins. 


BL00232B 32.79 8.263e-22 766-814 








BL00232B 32.79 2.397e-21 67-115 








BL00232B 32.79 4.133e-19 1481-1529 








BL00232B 32.79 1.000e-18 1371-1419 








BL00232B 32.79 2.662e-18 1691-1739








BL00232B 32.79 5.292e-18 1287-1335 








BL00232B 32.79 9.147e-18 1148-1196 








BL00232B 32.79 1.265e-17 980-1028 








BL00232B 32.79 1.529e-17 426-474 








BL00232B 32.79 2.588e-17 1084-1132 








BL00232B 32.79 1.386e-16 1184-1232 








BL00232C 10.65 5.390e-12 1369-1387 








BL00232C 10.65 1.391e-11 204-660








BL00232C 10.65 2.174e-11 1584-1164








BL00232C 10.65 4.522e-11 1689-1707








BL00232C 10.65 1.000e-10 65-83








BL00232C 10.65 4.115e-10 1285-1303 








BL00232B 32.79 7.200e-10 649-697 








BL00232C 10.65 9.827e-10 978-996 








BL00232C 10.65 1.947e-09 170-188 








BL00232B 32.79 2.137e-09 172-220








BL00232C 10.65 4.474e-09 1182-1200 
RT name 1 0 6$ R 717p-ftQ 1 Q 


243 


BL00795 


Involucrin proteins. 


BL00795C 17.06 4.977e-10 64-109 


244 


BL00790 


Receptor tyrosine kinase class V 
proteins. 


BL007901 20.01 7.823e-15 23-54 BL00790I 

7A HI O AftOtx 1 1 11 A 1A1 RT AA7QAT 7A A1 

1.900e-10 117-148 BL00790I20.01 3.893e- 
09215-246 


244 


PR00014 


FIBRONECTIN TYPE III REPEAT
SIGNATURE


PR00014D 12.04 6.400e-11 30-45

DDAAA1 An 10 (\A A AAAo 1 1 117 
rK.UUUl.4LI 1Z.U4 0.4UUe-ll jI/OjZ 

PR00014C 15.44 9.171e-09 204-223 

DDnAAl AT\ lO ftA 1 AHAa AC OIO 01*7 

rK0UU14U 1Z.U4 i.uUue-Uo ZZZ-Z5/ 




BL00183


Ubiquitin-conjugating enzymes 
proteins. 


DT nAI Ol 1<? AT *7 AITo. 1 A 1/f A 1 OQ 

JtJLUUloi Zo.y/ /.uJ/e-JLu l4u-lo& 


246 


PR00019 


LEUCINE-RICH REPEAT 

SIGNATURE


PR00019A 11.19 8.800e-12 205-219

I>I>AAA1AD 1 1 K O AAAa 1 1 OUC 

rKOuuiyrJ li. Jo z.uuoe-ii iuz-zio 


247 


BL00214 


Cytosolic fatty-acid binding proteins. 


BL00214B 26.51 7.180e-24 206-251 
BL00214A 21.17 6.250e-22 165-191 


247 


PR00178 


FATTY ACID-BINDING PROTEIN 
SIGNATURE 


PR00178A 15.07 4.913e-21 166-187 
PR00178C 20.54 2.500e-17 220-234
PR00178D 13.52 6.897e-16 272-291 
PR00178B 10.52 4.900e-10 200-212 


248


PR00395


RIBOSOMAL PROTEIN S2

SIGNATURE 


rJvUUJJOC iO.17 Z.04/e-l J 4O-04 


248 


BL00962 


Ribosomal protein S2 proteins. 


BL00962C 15.90 2.846e-12 46-64 


249


BL00227


Tubulin subunits alpha, beta, and 
gamma proteins. 


dLAJvZZIU 1o.4o l.UUUe-4U /4-lZo 

BL00227F 21.16 1.529e-33 226-280 

RT AA777R 7 A 1^1 A(\Qt* 7£ 1 7ft 711 

dl\j\)zz /xl Z4. i o i .4iiye-zo 1 / o-z 1 :> 




BL00227


Tubulin subunits alpha, beta, and
gamma proteins.


RT AA777f* 7*< Aft 1 AAAo AA Qi 
DLAJUZZ/lt Z3.45 l.UUUe-4U J 7-1^1 

RT AA777n 1 ft AA 1 nflAi»_Aft 1 A8-7A7 
DJ-AJUuiZ/U 10.40 l.UUUe-4U 140-ZUZ 

BL00227F 21.16 1.529e-33 300-354 


251 


BL00152 


ATP synthase alpha and beta subunits 
proteins. 


BL00152B 21.40 1.900e-31 191-229 
BL00152A 15.38 5.154e-21 134-160 


252 


BL00152 


ATP synthase alpha and beta subunits 
proteins. 


BL00152E 22.68 1.000e-32 285-323 
BL00152A 15.38 5.154e-21 134-160
BL00152C 11.41 6.250e-12 247-259


253 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 2.200e-11 54-63


253 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 2.029e-09 35-74 


254 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 9.739e-12 417-451 


254 


PR00417 


PROKARYOTIC DNA 
TOPOISOMERASE I SIGNATURE 


PR00417A 12.66 8.472e-09 65-79 


255 


BL01052 


Calponin family repeat proteins. 


BL01052C 18.51 1.000e-40 88-128 
BL01052A 16.12 2.875e-35 3-35 BL01052B 
15.31 5^19e-26 52-78 


255 


PR00888 


SMOOTH MUSCLE 
PROTEIN/CALPONIN FAMILY 
SIGNATURE 


PR00888D 16.09 9.112e-19 89-106 
PR00888E 11.81 2.800e-18 105-121 
PR00888F 7.44 4.600e-18 126-141
PR00888A 11.87 7.750e-18 3-18 PR00888C

17 77 1 OC£a 1*7 C7 £0 DDAAOOOP 11 7*1 

12.27 2.2ooe-l / 02-00 JrKOOoooCr 12.73 

Q A1Q<=> 1C 1£1 1*7*7 DDAAQQQD 17 77 1 'ill** 

y.43oe-15 103-1// rKUUooOD 13. fZ 1.32 le- 

1422-36 


255 


PR00890 


SMOOTH MUSCLE PROTEIN 22- 
ALPHA (TRANSGELIN) 

SIGNATURE


PR00890E 14.34 1.429e-27 136-156
PR00890A 8.61 1.000e-26 34-54 PR00890C

O 11 1 £AAo lO QC AO DBAAOAA& O 7C 

8.22 i,oooe-iy oo-yy rKOOoyoi? 0.75 
6.318e-19 62-78 PR00890F 12.92 1.205e-17 

1X1 17C DDAAOAAFI IX 17 t f-JAa lO 11A 

102-1/5 JrKOOoyOlJ 10.17 1.13Ue-13 liy- 

1 1Q 

12y 


257 


BL00745 


Prokaryotic-type class I peptide chain 
release factors signat 


BL00745C 13.66 1.000e-40 202-249 

DT AA7>I<D *)*) « fi XQla 1*1 1/Q lOI 

dJLAJU/4DJd 22. 50 o.0o5e-33 140-171 

BL00745D 14.90 8.435e-23 280-303 


ICQ 


BL00194


Thioredoxin family proteins. 


T)T AA1 O/l 1 1 1 £ *7 /IIOa 1 A XOil <07 

15JLUU1V4 12.10 /.42ye-iu oo4-oy/ 


260 


BL00612 


Osteonectin domain proteins. 


BL00612E 13.12 3.948e-10 391-436 


260


BL00484


Thyroglobulin type-1 repeat proteins 
proteins. 


DT f\(\AO Af* 1*7 A1 O 1j4/I a 1 1 1 CI 

r>L004o4O 17.01 8.244e-ll 136-151 

DT AA/lfMD O A>l 1 1 A C* 1 A 1VIO 1X7 

Dl//U4o4o y.U4 2.145e-10 245/-203 
PT AA/ffcAf 1 1*7 A1 1 IAOa^AQ 1XG IftA 

dlaju4o4^ l/.ui 2.3uye-uy 2oy-2o4 
BL00484B 9.04 8.950e-09 116-130 


Iff) 


PR00187


DNAJ PROTEIN FAMILY

SIGNATURE 


PPAA1C7A 11 flAO 17<« OO 10ft OAS 
rlvOOlo / A 12.04 2.3 /5e-0y 200-500 


262 


BL00198 


Nt-dnaJ domain proteins. 


BL00198A 8.07 3.681e-09 292-309




dt nni <7 


Aminotransferases class-V pyridoxal-
phosphate attachment site proteins.


DT AA1 C7 All *71 O. IAAa AO 1 X IX 

0LOO15/A 11. /2 o.2Uue-uy 10-20 


?a"1 
zoo - 




G-PROTEIN BETA WD-40 REPEAT

SIGNATURE 


PPAA17AP 1110 1 11Ca_A01A7 111 
rtssJVjZUB 12. ly 2.123c-Uy ZUf-ZZZ 






Trypanosome variant surface
glycoprotein. 


PPAAQ1 1 A 7 11 1 <AAo_AO £X£_X71 
rrUUyi^A /.3j 2.jUUe-Uy 00O-O/5 


266 


BL01144 


Ribosomal protein L31e proteins.


BL01144 25.07 1.000e-40 21-73 


zoo 


tymaaox 


1 Q£ rfcTCrYYITYnvT T XT TTj"D A/fTXT A T 


TYIVVTAAC 1 X *2 A CI Q 1 XQa 11 1 CI 1 AO 

DMUUMo 30.53 o.looe-15 153-iyo 


268 


BL00132 


Zinc carboxypeptidases, zinc-binding 
region 1 proteins. 


BL00132C 21.35 7.863e-10 307-348 
dUJUI j/A ZO.O/ S.yaoe-lU 2Z4-Zo5 


268 


PR00765 


CARBOXYPEPTIDASE A
METALLOPROTEASE (M14) 
FAMILY SIGNATURE 


PR00765B 15.57 7.171e-12 276-291 
PR00765D 14.16 1.551e-09 420-434 


268


BL00170


Cyclophilin-type peptidyl-prolyl cis-
trans isomerase signatur.


DTAA17AA 17 AO A AICa AO AQC C11 

rJLUUl /OA 17.0o y.01oe-0y 485-512 


269 


BL00622 


Bacterial regulatory proteins, luxR
family proteins. 


BL00622 32.69 9.780e-09 11-58


270 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 1.000e-11 447-461
PR00048A 10.52 4.316e-11 389-403
PR00048A 10.52 6.684e-11 362-376


270 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


PF00651 15.00 3.143e-10 37-50 


270 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 7.000e-10 392-409
BL00028 16.07 9.100e-10 256-273
BL00028 16.07 2.286e-09 450-467
BL00028 16.07 8.714e-09 365-382


274 


DM00303 


6 LEA 11-MER REPEAT REPEAT.


DM00303A 13.20 3.310e-09 467-517 


275 


PF00622 


Domain in SPla and the RYanodine
Receptor.


PF00622B 21.00 9.357e-14 374-396


PF00622C 12.62 1.857e-12 458-472 


275 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 8.800e-11 44-53


277 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


PF00651 15.00 9.133e-10 65-78


278 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 8.200e-16 295-308
PD00066 13.92 8.200e-16 519-532
PD00066 13.92 1.692e-15 351-364
PD00066 13.92 4.462e-15 547-122
PD00066 13.92 4.600e-14 323-336
PD00066 13.92 4.600e-14 435-448
PD00066 13.92 7.000e-14 463-476
PD00066 13.92 1.500e-13 239-252
PD00066 13.92 3.143e-12 267-280
PD00066 13.92 3.143e-12 407-420
PD00066 13.92 8.826e-11 211-224
PD00066 13.92 2.038e-10 491-504
PD00066 13.92 2.385e-10 379-392


278 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-16 444-458 
PR00048A 10.52 6.727e-15 360-374 
PR00048A 10.52 9.182e-15 528-542 
PR00048A 10.52 7.000e-14 472-486 
PR00048A 10.52 7.750e-14 388-402 
PR00048A 10.52 1.000e-13 332-346 
PR00048A 10.52 3.133e-13 304-318 
PR00048A 10.52 4.857e-13 118-132 
PR00048A 10.52 6.786e-13 500-514 
PR00048B 6.02 1.000e-12 292-302 
PR00048A 10.52 8.941e-12 192-206 
PR00048B 6.02 1.000e-11 348-358
PR00048A 10.52 1.947e-11 248-262
PR00048B 6.02 2.385e-11 264-274
PR00048B 6.02 7.231e-11 544-554
PR00048A 10.52 7.632e-11 416-430
PR00048B 6.02 8.615e-11 236-246
PR00048B 6.02 2.688e-10 516-526
PR00048B 6.02 4.375e-10 460-470
PR00048B 6.02 4.375e-10 488-498
PR00048B 6.02 4.938e-10 404-414
PR00048A 10.52 7.214e-10 220-234
PR00048B 6.02 1.947e-09 432-442
PR00048B 6.02 4.316e-09 572-582


278 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 5.012e-09 191-204 


279 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 6.400e-16 449-462
PD00066 13.92 6.538e-15 504-517
PD00066 13.92 9.308e-15 421-434
PD00066 13.92 7.000e-14 476-489
PD00066 13.92 6.087e-11 393-406


279 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.500e-17 350-367
BL00028 16.07 5.050e-13 405-422
BL00028 16.07 9.171e-12 433-450
BL00028 16.07 2.731e-11 488-505
BL00028 16.07 3.077e-11 516-533

RT/tflfttft 07 A Ififip. 1/1 377 *QA 
J5JLA>UUZo lO.U/ O.lUUc-lU jt /-J7*t 


279 




PRfYTFTM ROT A TR A "W<3n? TPTTfYW 

REGULATION AC. 


PTlA74i\7A 07 ASi £ 4&&o_AO 


279


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048B 6.02 5.154e-ll 501-511 
PR00048B 6.02 1.000e-10 446-456 
PR00048A 10.52 1.391e-10 513-527 
PR00048A 10.52 2.565e-10 485-499 
PR00048A 10.52 5.696e-10 402-416 
PR00048B 6.02 8.875e-10 418-428 
PR00048A 10.52 1.720e-09 430-444 
PR00048B 6.02 3.368e-09 390-400

DDAAA.4QA 1A <1 9 OAA- AO 1~7A OOO 


285


BL00276 


Channel forming colicins proteins. 


BL00276A 8.87 6.500e-09 257-269 


286 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.000e-30 10-49 


286 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 6.400e-16 388-401
PD00066 13.92 3.769e-15 248-261
PD00066 13.92 9.308e-15 304-317
PD00066 13.92 2.200e-14 360-373
PD00066 13.92 2.200e-14 416-429
PD00066 13.92 6.400e-14 332-345
PD00066 13.92 1.000e-13 220-233
PD00066 13.92 2.500e-13 192-205
PD00066 13.92 5.000e-13 276-289
PD00066 13.92 5.500e-09 136-149


286 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.286e-16 260-277
BL00028 16.07 2.588e-14 288-305
BL00028 16.07 2.800e-13 400-417
BL00028 16.07 6.850e-13 120-137
BL00028 16.07 3.423e-11 148-165
BL00028 16.07 7.923e-11 344-361
BL00028 16.07 2.500e-10 204-221
BL00028 16.07 2.500e-10 428-445
BL00028 16.07 3.100e-10 316-333
BL00028 16.07 6.100e-10 176-193
BL00028 16.07 1.771e-09 232-

7A.Q RTjf)fMV?R isms 90A. An ^7?_1R0 


286 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.000e-17 257-271
PR00048A 10.52 6.727e-15 397-411
PR00048A 10.52 2.929e-13 285-299
PR00048A 10.52 9.471e-12 369-383
PR00048B 6.02 1.000e-11 329-339
PR00048A 10.52 1.474e-11 313-327
PR00048A 10.52 2.421e-11 425-439
PR00048B 6.02 3.077e-11 385-395
PR00048A 10.52 6.684e-11 117-131
PR00048A 10.52 8.141e-11 201-215
PR00048A 10.52 1.783e-10 341-355
PR00048B 6.02 2.125e-10 301-311
PR00048B 6.02 2.125e-10 357-367
PR00048B 6.02 2.688e-10 217-227 
PR00048A 10.52 3.739e-10 229-243 
PR00048B 6.02 4.938e-10 273-283 
PR00048B 6.02 1.474e-09 245-255 
PR00048A 10.52 2.440e-09 145-159 
PR00048B 6.02 3.842e-09 161-171 
PR00048B 6.02 8.105e-09 441-451
PR00048B 6.02 9.053e-09 189-199 


287 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.407e-23 3-42 


287 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 8.941e-14 269-286
BL00028 16.07 1.000e-13 549-566
BL00028 16.07 2.565e-12 194-650
BL00028 16.07 6.087e-12 241-258
BL00028 16.07 6.870e-12 297-314
BL00028 16.07 6.870e-12 381-398
BL00028 16.07 7.214e-12 493-510
BL00028 16.07 1.346e-11 465-482
BL00028 16.07 1.692e-11 353-370
BL00028 16.07 3.769e-11 325-342
BL00028 16.07 6.192e-11 167-622
BL00028 16.07 8.962e-11 213-230
BL00028 16.07 1.600e-10 409-426
BL00028 16.07 5.200e-10 185-202
BL00028 16.07 6.700e-10 577-594
BL00028 16.07 3.057e-09 521-538
BL00028 16.07 6.143e-09 437-454


287 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 9.250e-14 238-252 
PR00048A 10.52 3.209e-12 266-280
PR00048A 10.52 4.706e-12 490-504
PR00048A 10.52 5.765e-12 462-476
PR00048A 10.52 7.882e-12 630-644
PR00048A 10.52 8.941e-12 518-532
PR00048A 10.52 9.471e-12 164-178
PR00048A 10.52 5.737e-11 378-392
PR00048A 10.52 7.158e-11 546-560
PR00048B 6.02 7.231e-11 180-190
PR00048A 10.52 8.141e-11 210-224
PR00048A 10.52 9.053e-11 294-308
PR00048A 10.52 9.053e-11 406-420
PR00048A 10.52 3.348e-10 322-336
PR00048B 6.02 3.813e-10 338-348
PR00048B 6.02 3.813e-10 394-404
PR00048B 6.02 3.813e-10 478-488
PR00048B 6.02 4.938e-10 506-516
PR00048A 10.52 8.043e-10 434-448
PR00048B 6.02 8.875e-10 226-236
PR00048B 6.02 8.875e-10 450-460
PR00048B 6.02 1.000e-09 366-376
PR00048B 6.02 1.000e-09 422-432
PR00048A 10.52 3.520e-09 136-588
PR00048B 6.02 7.158e-09 590-600
PR00048B 6.02 7.632e-09 310-320 

PP AAAASP £. CYJ *7 ^OaJ^Q 1 0A <70 
rKUUU*foJo O.uZ /.OjZe-uy JLZ4-D /z 

PR00048A 10.52 9.280e-09 350-364 




PR00070


DIHYDROFOLATE REDUCTASE
SIGNATURE 


rKUuU/UU Ij.W 0.x4Je-lO M-Oj 

PR00070D 11.63 2.929e-15 112-127 


289


BL00075


Dihydrofolate reductase proteins.


■QT f\f\f\nc A 0*7 *7A 1 OAAa 1 l£ O OQ DT AAATCD 

1 1 AO Q Q 1 1/» 15^1 OT AAA*7C/~» O < 1 
1.3.4:* J.olJe-lj j1-03 DxJJUU /jU o.Dl 

Z.oOZe-1 1 00-/5* D./4 o.iuoe-iu 
111 10^ 






PT TMfr A T PPTFP AM rVMP \A A TTMfl 
FACTOR 9TF2 OPPP WyMATTTBF 


ppaao^ati ia *w o iaij&_ao o^/lots 

rlvUUZOUxJ 14.0Z y.lOje-Uy Z34-Z /o 


294 


PR00081 


GLUCOSE/RIBITOL
SIGNATURE 


PR00081A 10.53 2.731e-09 39-57 


294 




ALCOHOL DEHYDROGENASE
SUPERFAMILY SIGNATURE


PP OniiRHP 17 1/5 (\dAAj* 11 101 011 


295 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 8.920e-09 276-290 
PR00806B 4.28 9^02e-09 275-289 


296 


PF00992 


Troponin. 


PF00992A 16.67 3.789e-10 553-588 


296 


BL00752 


XP A protein. 


BL00752B 19.17 8.144e-09 130-612 


296


DLAJL luU 


Kinesin light chain repeat proteins.


HT A1 T/CAP IO <A Q CCIa AO CliC <QA 


298 


PR00511 


TEKTIN SIGNATURE 


PR00511C 7.86 4.214e-09 371-388 


300


BL00353


HMG1/2 proteins.


15LUUJDJi5 11.47 V.l / le-iy Z2o-27o 


301 


PR00240 


ALPHA-1A ADRENERGIC 


PR00240C 8.38 3.941e-10 316-336 


302 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 2.200e-11 54-63


302 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 2.029e-09 35-74 




PR00193


MYOSIN HEAVY CHAIN
SIGNATURE


PR00193D 14.36 1.545e-31 390-419 
PR00193C 12.60 1.209e-25 143-171 
PR00193B 11.69 2.543e-24 95-121 
PR00193A 15.41 6.885e-19 39-59 
PR00193E 19.47 3.291e-12 444-473 




BL00675


Sigma-54 interaction domain proteins 
ATP-binding region A proteins. 


BL00675A 24.86 3.475e-09 98-142 


306 


PR00239 


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE


PR00239E 1.58 5.920e-ll 47-59 


306 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 7.923e-15 140-153
PD00066 13.92 4.000e-14 112-125
PD00066 13.92 1.391e-11 84-97
PD00066 13.92 1.692e-10 168-181


306 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.059e-14 96-113
BL00028 16.07 4.130e-12 124-141
BL00028 16.07 2.385e-11 68-85
BL00028 16.07 8.269e-11 180-197
BL00028 16.07 8.962e-11 152-169
BL00028 16.07 9.400e-10 319-336


306 


PR00799 


ASPARTATE 

AMINOTRANSFERASE 

SIGNATURE 


PR00799D 16.46 5.125e-09 188-214


306


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048B 6.02 1.900e-13 81-91
PR00048A 10.52 3.133e-13 65-79
PR00048A 10.52 9.357e-13 121-135
PR00048A 10.52 9.357e-13 149-163
PR00048B 6.02 2.688e-10 137-147
PR00048A 10.52 4.522e-10 279-293
PR00048A 10.52 5.696e-10 177-191
PR00048B 6.02 9.438e-10 109-119

DDAAH/IOA 1A CO *} Kft. AA rtl t AT 

rKUUU4oA Iv.dZ 3.1oUe-U9 93-107 
PR00048B 6.02 8.105e-09 165-175


307 


PD00015 


GLYCOPROTEIN PRECURSOR 
CELL SI. 


PD00015A 8.90 6.400e-09 35-43 


310 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 3.662e-ll 80-114 


311 


BL00824 


Elongation factor 1 beta/beta'/delta 
chain proteins. 


BL00824C 14.58 1.000e-40 129-167 
BL00824D 14.04 6.192e-39 167-202 
BL00824B 9.21 2.080e-21 96-116 
BL00824E 12.49 3.333e-19 210-226 


312 


PR00501


KELCH REPEAT SIGNATURE 


PR00501B 18.88 7.632e-09 476-491
PR00501B 18.88 9.763e-09 523-538 


313 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.200e-30 43-82 


313 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 6.500e-13 439-452
PD00066 13.92 8.000e-13 355-368
PD00066 13.92 1.000e-12 383-396
PD00066 13.92 4.000e-12 327-340
PD00066 13.92 5.714e-12 411-424
PD00066 13.92 8.435e-11 299-312
PD00066 13.92 5.800e-14 467-480


313 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.565e-12 451-468
BL00028 16.07 2.957e-12 311-328
BL00028 16.07 3.348e-12 367-384
BL00028 16.07 1.692e-11 423-440
BL00028 16.07 2.731e-11 283-300
BL00028 16.07 2.800e-10 339-356
BL00028 16.07 9.700e-10 199-216
BL00028 16.07 1.000e-09 395-412
BL00028 16.07 4.086e-09 120-137


313 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 5.909e-15 364-378 
PR00048A 10.52 2.286e-13 308-322 
PR00048A 10.52 7.429e-13 392-406 

PR00048A 10.52 2.421e-11 196-210
PR00048A 10.52 1.000e-10 280-294
PR00048B 6.02 3.813e-10 324-334
PR00048B 6.02 4.375e-10 464-474
PR00048A 10.52 6.870e-10 336-350
PR00048A 10.52 7.214e-10 420-434
PR00048B 6.02 7.750e-10 436-446
PR00048B 6.02 4.316e-09 380-390 


314 


PR00121 


SODIUM/POTASSIUM- 
TRANSPORTING ATPASE 
SIGNATURE 


PR00121D 16.72 1.577e-13 210-232 


314 


PR00119 


P-TYPE CATION-TRANSPORTING
SIGNATURE


PR00119B 13.94 9.194e-12 217-232




314 


BL01228 


Hypothetical cof family proteins. 


BL01228D 17.44 3.400e-11 646-671


314 


BL00154


E1-E2 ATPases phosphorylation site 
proteins. 


BL00154E 20.37 4.054e-13 486-527 
BL00154C 12.38 4.060e-12 213-232
BL00154F 8.23 9.597e-11 207-669


315 


BL00888 


Cyclic nucleotide-binding domain 
proteins. 


BL00888B 14.79 1.692e-10 396-420 


315 


BL00420 


Speract receptor repeat proteins domain 
proteins. 


BL00420A 20.42 8.338e-09 215-682 


315 


DM00668 


ZEIN.


DM00668A 10.20 8.500e-09 155-170 


316 


PR00727 


BACTERIAL LEADER PEPTIDASE 1 
(S26) FAMILY SIGNATURE 


PR00727C 13.04 9.063e-16 108-128 
PR00727B 12.51 7.848e-11 81-94


316 


BL00501 


Signal peptidases I serine proteins. 


BL00501D 16.69 2.884e-13 108-128 
BL00501C 9.61 9.561e-11 81-93
BL00501B 12.58 7.000e-09 61-77


317 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 9.471e-27 13-52 


317 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 5.235e-14 214-231
BL00028 16.07 6.850e-13 270-287
BL00028 16.07 9.100e-13 354-371
BL00028 16.07 1.391e-12 158-175
BL00028 16.07 1.346e-11 298-315
BL00028 16.07 3.769e-11 242-259
BL00028 16.07 6.538e-11 380-397
BL00028 16.07 8.800e-10 186-203
BL00028 16.07 1.514e-09 326-343


317 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048B 6.02 3.000e-12 199-209 
PR00048A 10.52 7.882e-12 351-365 
PR00048A 10.52 8.412e-12 323-337 
PR00048A 10.52 8.941e-12 239-253 
PR00048A 10.52 1.474e-11 211-225
PR00048A 10.52 6.211e-11 155-169
PR00048B 6.02 7.231e-11 311-321
PR00048A 10.52 8.141e-11 267-281
PR00048B 6.02 3.250e-10 339-349 
PR00048B 6.02 3.813e-10 255-265 
PR00048B 6.02 7.188e-10 283-293 

PR00048B 6.02 3.842e-09 393-403 
PR00048A 10.52 8.200e-09 295-309 


319 


PR00004 


ANAPHYLATOXIN DOMAIN 
SIGNATURE 


PR00004C 12.46 8.141e-09 91-103


320 


DM00060 


338 kw NEUREXIN ALPHA EI
CYSTEINE. 


DM00060 6.92 6.500e-11 28-38


320 


PR00010 


TYPE II EGF-LIKE SIGNATURE


PR00010C 11.16 7.667e-11 44-55


325 


PR00020 


MAM DOMAIN SIGNATURE 


PR00020A 18.17 5.776e-12 344-363 
PR00020C 13.66 6.932e-10 417-429 


325 


BL00740 


MAM domain proteins. 


BL00740A 13.87 8.313e-12 346-359
BL00740B 19.76 8.500e-09 486-507 


325 


PD02080 


T-CELL GLYCOPROTEIN CD8
CHAIN SURFACE ALPHA PRE.


PD02080B 20.69 9.621e-09 123-162




326 


BL00048 


Protamine P1 proteins.


BL00048 6.39 6.128e-10 167-194 


326 


PF01140 


Matrix protein (MA), p15.


PF01140D 15.34 9.791e-09 220-255


327 


PR00020 


MAM DOMAIN SIGNATURE 


PR00020C 13.66 2.615e-11 143-593
PR00020B 15.32 5.059e-10 52-69
PR00020B 15.32 1.789e-09 553-570


329 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 9.357e-32 8-47


329 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 3.209e-14 284-301
BL00028 16.07 4.600e-13 508-525
BL00028 16.07 6.400e-13 368-385
BL00028 16.07 4.115e-11 396-413
BL00028 16.07 4.115e-11 424-441
BL00028 16.07 8.269e-11 172-189
BL00028 16.07 8.962e-11 256-273
BL00028 16.07 9.308e-11 312-329
BL00028 16.07 9.654e-11 200-217
BL00028 16.07 3.100e-10 340-357
BL00028 16.07 5.500e-10 452-469
BL00028 16.07 9.100e-10 480-497
BL00028 16.07 4.086e-09 228-245


329 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 7.000e-14 272-285
PD00066 13.92 5.000e-13 328-341
PD00066 13.92 5.500e-13 188-201
PD00066 13.92 5.500e-13 384-397
PD00066 13.92 6.000e-13 496-509
PD00066 13.92 6.143e-12 468-481
PD00066 13.92 2.731e-10 440-453
PD00066 13.92 4.808e-10 160-173
PD00066 13.92 5.500e-10 244-257
PD00066 13.92 7.000e-09 216-229
PD00066 13.92 7.000e-09 412-425


332 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR 


PD02870B 18.83 5.871e-ll 468-501 


332 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 8.043e-10 275-289


332 


BL00240 


Receptor tyrosine kinase class III
proteins. 


BL00240B 24.70 4.447e-09 430-454


333 


BL00738 


S-adenosyl-L-homocysteine hydrolase 
proteins. 


BL00738J 18.61 1.000e-40 154-204
BL00738H 23.08 5.320e-36 468-521
BL00738F 12.23 7.261e-29 387-419
BL00738A 16.27 9.660e-27 216-256
BL00738C 16.33 7.923e-25 281-319
BL00738G 14.29 6.268e-23 446-468
BL00738B 12.28 8.085e-21 256-281
BL00738E 14.18 9.200e-19 361-384
BL00738I 14.57 5.135e-17 545-583
BL00738D 7.16 5.109e-13 335-350


333 


BL00836 


Alanine dehydrogenase & pyridine 
nucleotide transhydrogenase. 


BL00836D 22.30 8.622e-09 424-461


337 


PR00425 


BRADYKININ RECEPTOR 
SIGNATURE 


PR00425C 13.23 3.148e-09 80-100 


342 


PD01823 


PROTEIN INTERGENIC REGION
MITOCHONDRION T.


PD01823E 9.30 6.824e-12 108-121


PD01 R23D 1 6 1 96Se-09 A£%J\1 




PR00976


RIBOSOMAL PROTEIN S21
FAMILY SIGNATURE.




343 






nX/T0091^ 10 43 1 A^ftf»-ftQ 473-^06 

DM00215 19.43 4.814e-09 463-496 


343 


PR00671 


INHIBIN BETA B CHAIN 


PR00671C 4.18 9.172e-09 707-727 


343 


PD01234 


PROTEIN NUCLEAR 

r>T> OlVyTAT^ A TXT TP ATsIC 


PD01234B 15.53 1.000e-08 482-500 


344 


PR00175 


MYOGLOBIN SIGNATURE 


PR00175B 9.02 2.143e-10 25-49 


"3 Ail 


t>t? A.n.0 i >i 
PR00814


BETA HAEMOGLOBIN
SIGNATURE


nn A AO 1 Afy A AA £L ClOa 1 A CC OA 

PR00814C 9.20 6.523e-10 66-84


344 


PR00173 


ERYTHROCRUORIN FAMILY
SIGNATURE


PR00173A 15.91 7.158e-10 25-48 


344 


BL01033 


Globins profile. 


BL01033A 16.94 1.000e-16 25-47 

TJT A1 ftTlTi 1"i 01 O ZM C _ AA 0*7 AA 

BL01033B 13.81 8.615e-09 87-99 


344 


PR00612 


ALPHA HAEMOGLOBIN 
SIGNATURE 


PR00612E 9.04 4.194e-12 122-139 
PR00612B 10.92 3.483e-10 32-43 

HTJAA^IOA A A ilO* AA *T A DO 

PR00612D 9.76 9.438e-09 74-88 


345 


PR00814 


BETA HAEMOGLOBIN 
SIGNATURE 


PR00814C 9.20 6.523e-10 104-122


345 


BL01033 


Globins profile. 


BL01033A 16.94 5.125e-10 63-85
BL01033B 13.81 8.615e-09 125-137 


345 


PR00612 


ALPHA HAEMOGLOBIN 
SIGNATURE 


PR00612E 9.04 4.194e-12 160-177 
PR00612B 10.92 3.483e-10 70-81 
PR00612D 9.76 9.438e-09 112-126 


349 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.133e-32 6-45


350 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 6.318e-19 364-382
BL00972D 22.55 7.968e-16 210-673 
BL00972B 9.45 1.600e-12 445-455 


350 


PR00049 


WILMS TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.008e-13 121-136 
PR00049D 0.00 7.375e-12 125-140 
PR00049D 0.00 5.916e-11 128-143

T»T% f\f\/\ A ATX f\ /N r\ f rm A Ci 4 4 4 4AM 

PR00049D 0.00 6.748e-11 122-137

HT1AAA A AT\ A A A A onr 4 1 ilsr 1 i 1 

PR00049D 0.00 9.395e-11 126-141

PPOOOAQ'n ft Oil 1 9RaWI0 110-1 34 

PR00049D 0.00 8.929e-10 127-142 
PR00049D 0.00 2.678e-09 129-144 
PR00049D 0.00 4.051e-09 123-138 
PR00049D 0.00 4.051e-09 124-139 
PR00049D 0.00 4.051e-09 130-145 


350 


PR00211 


GLUTELIN SIGNATURE 


PR00211B 0.86 7.500e-09 124-145 


350 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 5.339e-10 108-141 
DM00215 19.43 7.268e-10 112-145 
DM00215 19.43 2.525e-09 106-139 
DM00215 19.43 9.695e-09 107-140 


350 


BL00048 


Protamine P1 proteins.


BL00048 6.39 9.888e-09 145-172 


352 


BL00518 


Zinc finger, C3HC4 type (RING
finger), proteins.


BL00518 12.23 4.429e-10 214-223






BL00518


Zinc finger, C3HC4 type (RING
finger), proteins.


QT r\f\c i o 11 11 A A1t\~ m tTrt foo 

BL00518 12.23 4.429e-10 179-188 


354 


BL01009 


Extracellular proteins SCP/Tpx-
1/Ag5/PR-1/Sc7 proteins.


BL01009D 14.19 9.341e-17 160-181 
BL01009A 13.75 3.769e-14 80-98 
BL01009E 13.50 5.333e-14 194-210 
BL01009C 10.54 2.667e-11 127-141


354 


PR00838 


VENOM ALLERGEN 5 SIGNATURE 


PR00838G 16.07 2.304e-14 158-178 
PR00838D 8.73 4.452e-12 80-99
PR00838F 10.11 7.532e-10 125-141


354 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 


PR00837C 17.21 7.429e-18 159-176 
PR00837A 14.77 1.900e-15 80-99 
PR00837D 11.12 2.198e-13 195-209 
PR00837B 11.64 3.483e-09 127-141 


356 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 8.500e-17 16-41 
BL00215B 10.44 4.900e-09 177-190 
BL00215A 15.82 6.786e-09 133-158 
BL00215B 10.44 7.300e-09 278-291 




PR 01107 1\ 


MITOCHONDRIAL CARRIER
PROTEIN SIGNATURE


PR00926E 11.70 6.049e-13 91-110 
rKUUyzor 1 /. /•> 7.oUUe-l 1 24U-2o3 
PR00926F 17.75 5.219e-10 18-41 PR00926D 


357 


PR00326 


GTP1/OBG GTP-BINDING PROTEIN
FAMILY SIGNATURE


PR00326A 8.75 7.150e-11 21-42


357 


BL00113 


Adenylate kinase proteins.


HT HOI HA 10 7A A £77<»_fiO 77 


357 


BL01128 


Shikimate kinase proteins. 


BL01128A 18.84 7.802e-09 21-55


357 


BL00300 


SRP54-type proteins GTP-binding 
domain proteins. 


BL00300B 20.56 1.000e-08 18-64 


358 




Ubiquitin carboxyl-terminal hydrolases
family 2 proteins. 


BL00972A 11.93 6.318e-19 324-342
BL00972D 22.55 3.903e-16 170-194 
dimWIZd y.45 l.OUUe-lZ 405-415 


364 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 1.482e-10 355-388 


364 


PR00217 


43 KD POSTSYNAPTIC PROTEIN 


PR00217C 10.91 4.600e-10 302-318 


365 


BL00518 


Zinc finger, C3HC4 type (RING
finger), proteins.


BL00518 12.23 2.800e-11 125-134


365 


BL00415 


Synapsins proteins. 


BL00415N 4.29 2.839e-09 387-431 


365 


DM00215 


PROLINE-RICH PROTEIN 3.


DMOnilS 19 /tt 7 7fi»M» 11 177-d.in 
DM00215 19.43 8.412e-11 333-366
DM00215 19.43 2.678e-09 356-389 
DM00215 19.43 5.138e-09 376-409 


365 


BL01102 


Prokaryotic dksA/traR C4-type zinc
finger. 


BL01102 15.99 5.705e-09 109-135 


365 


PR00211 


GLUTELIN SIGNATURE 


PR00211B 0.86 5.959e-11 407-428
PR00211B 0.86 2.212e-10 401-422 
PR00211B 0.86 9.500e-09 336-357 


365 


PR00049 


WILMS TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 9.695e-09 335-350 


367 


PD00126 


PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 


PD00126A 22.53 8.448e-09 2-23


370


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 7.353e-14 157-174
BL00028 16.07 1.000e-13 269-286
BL00028 16.07 8.200e-13 493-510
BL00028 16.07 3.739e-12 213-230
BL00028 16.07 6.478e-12 381-398
BL00028 16.07 1.346e-11 185-202
BL00028 16.07 2.385e-11 129-146
BL00028 16.07 2.385e-11 325-342
BL00028 16.07 5.154e-11 241-258
BL00028 16.07 9.654e-11 437-454
BL00028 16.07 1.300e-10 297-314
BL00028 16.07 9.100e-10 409-426
BL00028 16.07 9.100e-10 465-482


370 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 2.385e-15 229-242
PD00066 13.92 3.077e-15 145-158
PD00066 13.92 8.800e-14 173-186
PD00066 13.92 3.500e-13 369-382
PD00066 13.92 8.500e-13 341-354
PD00066 13.92 9.133e-12 397-410
PD00066 13.92 2.174e-11 313-326
PD00066 13.92 3.348e-11 453-466
PD00066 13.92 3.739e-11 481-494
PD00066 13.92 7.214e-11 257-270
PD00066 13.92 2.038e-10 425-438
PD00066 13.92 6.538e-10 201-214
PD00066 13.92 5.200e-09 285-298


370 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III.


DM01970B 8.60 6.201e-09 265-278 


370 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 1.474e-11 462-476
PR00048A 10.52 6.684e-11 182-196
PR00048A 10.52 2.957e-10 434-448

T\n AAA i An V* AA ^ ^AA A *%*\ A 1 A A 

PR00048B 6.02 5.500e-10 338-348 
PR00048A 10.52 6.478e-10 350-364 
PR00048B 6.02 6.187e-10 226-236
PR00048A 10.52 6.870e-10 490-504
PR00048A 10.52 8.826e-10 406-420
PR00048B 6.02 3.842e-09 170-180 

DDAAAAQT} A AO A 11 £a_AO 1*7< 

PR00048B 6.02 4.789e-09 478-488 
PR00048B 6.02 7.632e-09 142-152 

rKUUlrtOA 1U. o. YLJXr*Ji lZO-l'HJ 

PR00048B 6.02 9.053e-09 450-460 


371 


BL01019 


ADP-ribosylation factors family 
proteins. 


BL01019B 19.49 6.276e-21 95-150
BL01019A 13.20 8.453e-17 51-91 


371 


PR00328 


GTP-BINDING SAR1 PROTEIN
SIGNATURE 


PR00328C 13.16 8.481e-13 78-104 
PR00328D 12.56 3.357e-11 123-145


371 


BL01115


GTP-binding nuclear protein ran 
proteins. 


BL01115A 10.22 8.119e-11 21-65


373 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 4.522e-12 208-225 


373 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 7.000e-13 194-207
PD00066 13.92 7.000e-13 224-237
PD00066 13.92 7.000e-12 254-267


373 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 1.391e-10 205-219


PR00048B 6.02 6.063e-10 221-231


374 


PR00308 


TYPE I ANTIFREEZE PROTEIN
SIGNATURE


PROO^ORA 5 90 7 9Rfte-1 1 SH-S4K 
PR00308A 5.90 8.835e-09 534-549 


377 


PD02784 


PROTEIN NUCLEAR
RIBONUCLEOPROTEIN.


Pnft97R4B 96 4a" 7 S^Jte-OQ 147-100 


378 




PROTEIN REPEAT
NEUROFILAMENT TRIPL


pnoi ^i a a £Q 7 4^Qf»_no 1 ss. 1 


380


PF00094 


von Willebrand factor type D domain 
proteins. 


PF00094C 12.88 1.918e-09 43-53


380 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 3.667e-11 120-135
BL01208B 15.83 1.973e-09 178-193 


380


PD02138


PRECURSOR GLYCOPROTEIN
SIGNAL CELL 


PD02138A 27.60 9.057e-0jJ 20-69 




BL01105


Ribosomal protein L35Ae proteins.


BL01105B 12.95 7.930e-13 43-83 


384 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 9.205e-10 10-25
PR00049D 0.00 1.915e-09 9-24


JOJ 


BL01115


GTP-binding nuclear protein ran
proteins. 


BL01115A 10.22 8.909e-13 34-78


385 


BL00905 


GTP1/OBG family proteins.


BL00905D 15.00 5.313e-09 140-155 


JOJ 




TRANSFORMING PROTEIN P21
RAS SIGNATURE 


PR00449C 17.27 3.209e-19 75-98
PR00449A 13.20 1.000e-17 34-56 

rKUU44yL/ 1U. /V J.3ooe-l J 139-153 
PP0044QP: 14 ^4 51 1£4a 1 1 S*7 74 PPOn44QX7 

13.50 8.286e-09 174-197 


386 


BL00115 


Eukaryotic RNA polymerase II
heptapeptide repeat proteins.


P.T OA1 1 ^7 1 n 7 Q77*» 1 A ^07_44A 


386 


PR00041 



CAMP RESPONSE ELEMENT
BINDING (CREB) PROTEIN
SIGNATURE 


PP00041T7 R Q U^J^Q 0^^_'574 


388 


PF00646 


F-box domain proteins.


PF00a^J/«A 14 V7 Q CilAe* 10 9ft_49 


389 


BL00036 


bZIP transcription factors basic domain 
proteins. 


BL00036 9.02 6.294e-12 81-94 


389 


PR00042 


FOS TRANSFORMING PROTEIN

SIGNATURE 


PPnoo47r t fi 70 ft mcp 1 1 ft*> oo pponn/ion 
8.97 9.895e-10 100-122 




JOX-AJUZrZi't 


Clathrin light chain proteins.




389 


PR00043 


JUN TRANSCRIPTION FACTOR 
SIGNATURE 


PR00043B 8.73 9.596e-09 81-98 


390 


PF00622 


Domain in SPla and the RYanodine 
Receptor. 


PF00622B 21.00 2.500e-13 85-107 


391 


BL00564 


Argininosuccinate synthase proteins. 


BL00564A 19.93 6.114e-09 7-44


392 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-14 230-244 
PR00048A 10.52 4.316e-11 202-216


392 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.125e-15 205-222 BL00028 
16.07 1.391e-12 233-250 BL00028 16.07 
3.400e-10 177-194 


392 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 3.000e-13 193-206 PD00066 
13.92 3.423e-10 221-234 


393 


PF00622 


Domain in SPla and the RYanodine 
Receptor. 


PF00622B 21.00 1.391e-16 132-154


393


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 8.800e-10 761-778 BL00028 
16.07 2.029e-09 789-806 


393 


¥\t> r\r\r\ A o 

PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.800e-09 758-772 


394 


PR00501 


KELCH REPEAT SIGNATURE 


1\tt AA^A4 A A /H r* +t A AA A A #»A*V t* f + 

PR00501A 8.25 1.409e-09 537-551


394 


DM00099 


4 kw A55R REDUCTASE 
TERMINAL. DIHYDROPTERIDINE 


DM00099B 14.73 4.375e-09 415-425 


395 


PR00399 


SYNAPTOTAGMIN SIGNATURE 


PR00399A9J2 3.133e-19 146-162 
PR00399C 12.82 8.200e-17 222-238 
PR00399B 14.27 7.750e-16 161-175 
PR00399D 14.48 4.000e-14 242-253 


395 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 8.269e-13 201-215 
PR00360A 14.59 2.800e-12 174-187 
PR00360B 13.61 5.217e-12 340-354 
PR00360A 14.59 5.207e-10 311-324


395 


PF00168 


C2 domain proteins. 


PF00168C 27.49 5.500e-18 323-349 
PF00168B 11.83 2.000e-09 306-317 


396 


BL01013 


Oxysterol-binding protein family 
proteins. 


BL01013A 25.14 7.231e-21 558-156 

T\ T" A * df\ 4 A Tl 4 *■ A A 4 AAA * 4 4 a gm 4 a 

BL01013B 11.33 1.000e-11 185-196


396 


PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors. 


PF00791B 28.49 3.534e-10 52-107 


396 


PD00078 


REPEAT PROTEIN ANK NUCLEAR 
ANKYR. 


PD00078B 13.14 9.000e-ll 173-186 
PD00078B 13.14 3.739e-09 78-91 
PD00078B 13.14 4.130e-09 45-58 


396 


PF00023 


Ank repeat proteins. 


PF00023B 14.20 3.077e-11 48-58
PF00023B 14.20 3.769e-11 176-186
PF00023A 16.03 7.429e-09 85-101


397 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 1.750e-10 55-71 


397 


PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors. 


PF00791B 28.49 4.455e-ll 55-110 
PF00791B 28.49 7.291e-10 88-143


398 


BL00422 


Granins proteins. 


BL00422C 16.18 5.787e-10 134-162 


400 


PR00450 


RECOVERIN FAMILY SIGNATURE


PR00450D 16.58 8.986e-11 161-181


400 


BL00479 


Phorbol esters / diacylglycerol binding 
domain proteins. 


BL00479B 12.57 4.273e-15 287-303 
BL00479A 19.86 2.667e-14 261-284 
BL00479B 12.57 1.360e-10 351-367 


400 


PR00171 


CLASS EH CYTOCHROME C 
SIGNATURE 


PR00171D 7.30 9.419e-10 334-342 


Af\(\ 


BL00018


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 3.348e-09 223-236


400 


PF00781


Diacylglycerol kinase catalytic domain
proteins (presumed). 


PF00781F 16.43 1.000e-40 600-199 
PF00781B 12.07 8.364e-35 454-486 
PF00781D 11.11 3.077e-30 532-118 
PF00781C 9.69 5.034e-19 506-521 
PF00781E 12.45 2.385e-17 124-583
PF00781G 10.09 6.211e-17 678-692
PF00781H 12.20 1.750e-16 770-782
PF00781A 6.42 3.667e-09 354-360 


401 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.407e-09 325-340 


402 


DM01117 


2 kw TRANSPOSASE WITHIN
TRANSPOSITION VASOTOCIN.


DM01117A 11.17 7.750e-09 364-382




403 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 9.286e-12 724-744
DM01206B 10.69 3.466e-10 726-746 
DM01206B 10.69 9.630e-10 722-742 
DM01206B 10.69 7.152e-09 718-738 
DM01206B 10.69 8.861e-09 728-748 


403 


BL00048 


Protamine P1 proteins.


BL00048 6.39 4.197e-10 722-749
BL00048 6.39 5.500e-10 731-758
BL00048 6.39 6.329e-10 729-756
BL00048 6.39 9.171e-10 730-757
BL00048 6.39 4.038e-09 728-755
BL00048 6.39 8.538e-09 724-751
BL00048 6.39 9.438e-09 716-743


403 


PD00289 


PROTEIN SH3 DOMAIN REPEAT
PRESYNA.


PD00289 9.97 9.690e-09 130-144 


404 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 1.353e-27 31-70 


404 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 5.154e-15 274-287
PD00066 13.92 7.600e-14 246-259
PD00066 13.92 8.200e-14 302-315
PD00066 13.92 3.143e-12 218-231
PD00066 13.92 4.000e-12 190-203
PD00066 13.92 2.800e-09 330-343


*t\JH 




Zinc finger, C2H2 type, domain
proteins. 


"RT AnOOR 1 A fY7 7 9/;ip 10 01 A 0A7 T*T AAAOfl 

16.07 9.171e-12 342-359 BL00028 16.07 
4.300e-10 314-331 BL00028 16.07 7.000e- 
10 174-191 BL00028 16.07 3.314e-09 202- 
219 BL00028 16.07 6.400e-09 286-303 


404 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 4.214e-13 339-353 
PR00048A 10.52 3.209e-12 227-241 

PR00048A 10.52 4.522e-10 171-185 
PR00048B 6.02 2.895e-09 299-309 
PR00048A 10.52 4.600e-09 199-213 
PR00048B 6.02 1.000e-08 187-197 
PR00048B 6.02 1.000e-08 271-281 


406 


BL00610 


Sodium:neurotransmitter symporter
family proteins.


BL00610A 17.73 1.000e-40 68-118 

"RT AA£1AT* 01 1 AAAo^/lA 110 ICO 

BL00610C 12.94 1.000e-40 225-277 
BL00610D 20.57 1.000e-40 291-344
BL00610F 29.02 6.143e-36 540-157 
BL00610E 20.34 3.209e-35 448-491 
BL00610G 12.89 2.200e-15 173-196 


406 


PR00176 


SODIUM/NEUROTRANSMITTER
SYMPORTER SIGNATURE 


PR00176C 10.84 6.226e-23 141-168
PR00176A 16.82 1.450e-22 68-90
PR00176F 10.73 8.667e-20 452-472
PR00176B 7.31 7.000e-18 97-117
PR00176D 9.02 1.000e-17 252-270
PR00176E 11.41 2.756e-15 334-355
PR00176H 15.27 7.353e-15 131-590
PR00176G 12.48 5.615e-14 529-112


407 


DM00179 


w KINASE ALPHA ADHESION T- 
CELL. 


DM00179 13.57 5.304e-09 111-121


408


PR00187 


DNAJ PROTEIN FAMILY 


PR00187B 13.48 1.800e-16 45-66

DDAA1 Q1 A 11 QA £ TAAa 11 1< 

rKVUlo/A 1Z.S4 O./We-lZ 


408 


BL00198 


Nt-dnaJ domain proteins. 


BL00198B 15.11 9.217e-15 45-66
BL00198A 8.07 2.459e-11 19-36


409 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE 


PR00927E 14.93 4.136e-11 246-268


409 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 9.735e-14 11-36 
BL00215A 15.82 5.787e-11 108-133
BL00215B 10.44 6.211e-11 258-271
BL00215A 15.82 5.018e-09 211-236 


409 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926D 10.53 5.355e-09 19-38 


410 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 6.400e-17 411-424 PD00066
13.92 8.200e-17 327-340 PD00066 13.92
5.154e-15 271-284 PD00066 13.92 2.800e-
14 215-228 PD00066 13.92 9.000e-13 355-
368 PD00066 13.92 6.143e-12 439-452
PD00066 13.92 6.478e-11 187-200 PD00066
13.92 9.217e-11 243-256


410 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.588e-14 227-244 BL00028 
16.07 6.824e-14 395-412 BL00028 16.07 
7.882e-14 171-188 BL00028 16.07 2.350e-
13 339-356 BL00028 16.07 7.300e-13 283- 
300 BL00028 16.07 7.300e-13 367-384 
BL00028 16.07 2.565e-12 423-440 BL00028 
16.07 7.261e-12 199-216 BL00028 16.07 
7.261e-12 311-328 BL00028 16.07 8.435e- 
12 451-468 BL00028 16.07 2.038e-11 255-
272 BL00028 16.07 9.400e-10 143-160 


410 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 3.250e-14 280-294
PR00048A 10.52 8.500e-14 336-350 
PR00048A 10.52 7.429e-13 252-266 
PR00048A 10.52 8.714e-13 448-462 
PR00048A 10.52 9.357e-13 392-406
PR00048A 10.52 1.000e-12 168-182 
PR00048A 10.52 2.059e-12 420-434 
PR00048B 6.02 8.615e-11 408-418
i*Kuu Woxy o.uz /. 1 oo e- 1 u zoo-z / o 
PR00048B 6.02 7.188e-10 380-390 
PR00048B 6.02 9.438e-10 296-306 
PR00048B 6.02 1.000e-09 324-334 
PR00048B 6.02 1.474e-09 352-362 
PR00048B 6.02 3.842e-09 212-222
PR00048B 6.02 5.263e-09 436-446


411 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 5.500e-10 63-76 


413 


PR00014 


FIBRONECTIN TYPE III REPEAT
SIGNATURE 


PR00014C 15.44 4.600e-10 73-92 


414 


PR00806 


VINCULIN SIGNATURE 


PR00806A 6.63 1.493e-09 785-796 


414 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 4.240e-09 41-55




414 


PR00049 


WILMS TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 9.546e-11 781-796
PR00049D 0.00 1.205e-10 263-278
PR00049D 0.00 4.356e-09 785-800


414


BL00412


Neuromodulin (GAP-43) proteins.


UT f\r\A 1 AT\ 1 £i C.A A rTl A AA .4 A A Ani 

BL00412D 16.54 4.673e-09 420-471 


414 


BL00422 


Granins proteins. 


BL00422C 16.18 6.318e-ll 439-467 

TTVT AA A A A/""* 1^ 1 A A AAA- i A A A A AATO 

BL00422C 16.18 9.809e-10 440-468 
BL00422C 16.18 6.294e-09 441-469 
BL00422C 16.18 6.209e-09 438-466


414 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE


PR00910A 2.51 8.179e-09 265-278 


414 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 4.203e-09 770-803 
DM00215 19.43 9.085e-09 245-278 


414 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 1.257e-09 44-61 BL00028
16.07 2.543e-09 175-192 BL00028 16.07
6.143e-09 119-136 BL00028 16.07 9.743e-
09 147-164


415 


PF00622 


Domain in SPla and the
RYanodine Receptor


PF00622B 21.00 1.000e-13 331-353 

TITJAA^I^/^ 

PF00622C 


415 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 3.400e-11 31-40


416 


PF00780 


Domain found in NIK1-like kinases,
mouse citron and yeast ROM. 


PF00780B 23.03 5.929e-33 442-485


416


PR00109


TYROSINE KINASE CATALYTIC
DOMAIN SIGNATURE


T1TJ AA1 AA"Q 1 A An f A1C. 1A All AAA 

PR00109B 12.27 5.235e-12 211-230 


416


BL00107


Protein kinases ATP-binding region 
proteins. 


UT AA1 A"7 A 1 O A A C AAA. OA All Ail A 

BL00107A 18.39 5.200e-22 211-242

UT AA1ATD n H A aao a ia aqa aaa 

dIaJO 10/15 13. J 1 y.io<$e-i2 2o3-2yy 


416 


BL00239 


Receptor tyrosine kinase class II 
proteins. 


BL00239B 25.15 5.164e-10 145-193 


416 


BL00915 


Phosphatidylinositol 3- and 4-kinases 
proteins. 


BL00915C 22.43 9.357e-10 203-242


417 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 1.482e-14 41-59 

XXK f\f\fY)1T\ OA CA *> 11% lO 1A1 OIC 

dIaIUUZIU 24.30 Z.lzze-lZ 15/3-235 


417 


PR00722 


CHYMOTRYPSIN SERINE
PROTEASE FAMILY (S1)
SIGNATURE


PR00722A 12.27 7.517e-14 42-58
PR00722B 12.51 3.143e-10 97-112


417 


BL00134 


Serine proteases, trypsin family,
histidine proteins.


"RL001 **4 A 1 1 Q6 fi 4£4e-1 a* 41 -5R 
BL00134C 13.45 2.059e-09 221-235 


417 


BL00495 


Apple domain proteins. 


BL00495O 13.75 2.440e-09 212-241 


417 


BL00672 


Serine proteases, V8 family, histidine 
proteins. 


BL00672A 9.79 9.520e-09 41-57 


417 


PR00839 


V8 SERINE PROTEASE FAMILY 
SIGNATURE 


PR00839B 11.20 9.753e-09 41-59 


418 


BL01207 


Glypicans proteins. 


BL01207B 23.69 9.122e-28 191-237 
BL01207A 12.21 1.000e-16 62-78 


423 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870D 15.74 4.351e-09 693-728 


423 


DM00179 


w KINASE ALPHA ADHESION T- 
CELL. 


DM00179 13.97 5.696e-09 793-803 
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424 


BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 5.041e-09 13-59 


425 


BL00107


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 8.141e-18 217-248 


425 


BL00240 


Receptor tyrosine kinase class III
proteins. 


BL00240E 11.56 6.040e-10 203-241 


425 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 5.814e-14 217-236 
PR00109A 15.00 1.730e-09 182-196 


428 


PR00141 


PROTEASOME COMPONENT 
SIGNATURE 


PR00141C 11.15 6.333e-12 234-246 
PR00141D 12.45 8.615e-12 259-271 
PR00141B 11.15 9.561e-12 223-235 
PR00141A 11.36 2.050e-11 102-118


428 


BL00854


Proteasome B-type subunits proteins. 


BL00854A 33.93 1.383e-19 99-145 
BL00854C 29.92 5.235e-14 206-235 
BL00854D 13.76 2.800e-09 257-267


429 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 9.413e-17 59-81 
PR00245C 7.84 7.500e-16 238-254 
PR00245E 12.40 2.500e-12 291-306 

PR00245B 10.38 9.112e-11 177-192


429 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237E 13.03 7.120e-12 199-223 
PR00237C 15.69 1.225e-09 104-127


429 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 9.727e-14 90-130 
BL00237D 11.23 1.273e-09 282-299 


429 


PR00534 


MELANOCORTIN RECEPTOR 
FAMILY SIGNATURE 


PR00534A 11.49 6.400e-09 51-64 


430 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


PF00651 15.00 1.000e-11 87-100


430 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 4.706e-14 474-491 BL00028 
16.07 1.771e-09 502-519 


430 


PD00066


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 4.300e-09 490-503 


430 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 4.600e-09 499-513 


433 


BL00086 


Cytochrome P450 cysteine heme-iron 
ligand proteins. 


BL00086 20.87 3.209e-23 430-462 


433 


PR00465 


E-CLASS P450 GROUP IV
SIGNATURE


PR00465F 13.37 1.360e-11 400-419


433 


PR00359 


B-CLASS P450 SIGNATURE 


PR00359G 11.22 8.071e-10 401-417

DDAniCQU OA OAT 1 On« AO 1*71 A A 1 


433 


PR00385 


P450 SUPERFAMILY SIGNATURE 


PR00385E 12.66 8.800e-11 440-452
PR00385D 13.11 4.429e-10 431-441 
PR00385A 14.97 5.865e-09 302-320 


433 


PR00464 


E-CLASS P450 GROUP II
SIGNATURE 


PR00464G 12.41 9.000e-10 405-421 
PR00464D 17.40 1.191e-09 320-338 
PR00464E 18.28 6.946e-09 349-370 
PR00464H 13.32 7.750e-09 427-441 
PR00464C 18.84 9.014e-09 291-320
PR00464I 14.64 9.481e-09 440-464


434 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 7.943e-19 101-151 


434 


PR00171 


SUGAR TRANSPORTER 
SIGNATURE 


PR00171D 12.76 3.593e-11 413-435
435


PR00049 


WILMS TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.429e-10 10-25 


435 


BL00028 


Zinc finger, C2H2 type, domain
proteins. 


BL00028 16.07 4.150e-13 138-593 BL00028
16.07 6.850e-13 1010-1027 BL00028 16.07
6.087e-12 982-999 BL00028 16.07 8.615e-
11 846-863 BL00028 16.07 3.100e-10 317-
334 BL00028 16.07 7.000e-10 170-187
BL00028 16.07 8.500e-10 289-306 BL00028 

Ifi 07 8 Rft0p_1 ft Sd.R.SrtS 


435 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 7.600e-14 998-1011 
PD00066 13.92 1.000e-11 305-318 PD00066
13.92 8.826e-11 564-577 PD00066 13.92
3.400e-09 862-875 


435 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 5.329e-09 177-192 
PR00456E 3.06 5.899e-09 140-155 


435 


BL00999 


Streptomyces subtilisin-type inhibitors 
proteins. 


BL00999A 14.95 7.223e-09 461-499 


435 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 9.357e-13 573-587 
PR00048A 10.52 2.421e-11 1007-1021

T}T> AHA/I OT> *C AO 1 IOC*. lftr/1 1*3^ 

rKUUlHso 0.U2 2.125&-10 501-133 
PR00048A 10.52 8.043e-10 314-328 
PR00048B 6.02 1.000e-09 995-1005
PR00048B 6.02 6.684e-09 302-312 
PR00048A 10.52 9.280e-09 167-181 






OLFACTORY RECEPTOR

SIGNATURE 


PR00245A 18.03 2.007e-23 100-122
PR00245C 7.84 1.783e-14 232-248
PR00245D 10.47 7.070e-10 268-280 


436 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE


PR00237C 15.69 8.500e-11 145-168
PR00237G 19.63 6.023e-09 266-293


436 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 2.161e-15 131-171 
BL00237D 11.23 8.091e-09 276-293


All 


PR00262


IL1/HBGF FAMILY SIGNATURE


DTOAAOJCO A *%Q 1 AAA. AO OA 1 AO 

rKUU2o2A.2o.Zo 1.0UUe-0e oO-lUS 


438 


BL00884 


Osteopontin proteins. 


BL00884B 12.47 1.000e-40 50-94 
BL00884C 22.45 6.187e-39 131-173 
BL00884A 11.35 5.846e-32 1-31 BL00884E 
11.04 8.364e-23 273-295 BL00884D 8.79 
3.323e-18 255-272 


438 


PR00216 


OSTEOPONTIN SIGNATURE 


PR00216B 7.89 4.553e-34 37-67 PR00216A 
10.94 8.054e-33 2-32 PR00216C 9.63 
2.565e-32 67-93 PR00216G 12.39 8.676e-27 
238-264 PR00216H 7.41 5.295e-22 273-293 
PR00216F 11.79 3.133e-21 164-183 
PR00216D 2.74 5.800e-18 104-119
PR00216E 8.44 4.405e-16 132-147 
SEQ
ID 


Pfam Model


Description 


E-value


Score 


No: of 

Pfam 

Domains 


Position of 
the Domain 


1 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-
H type


1.8e-05 


31.6 


1 


412-438 


1 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


2e-05


21.8 


1 


14-52 


3 


EMP24_GP25L 


emp24/gp25L/p24 family


4.1e-105 


362.6 


1 


22-235 


6 


WW 


WW domain 


1.2e-05


32.2 


1 


45-75 


7 


WW 


WW domain 


1.2e-05


32.2 


1 


45-75 


8 


Aa_trans


Transmembrane amino acid 
transporter protein 


9.6e-64 


225.2 


1 


71-451 


9 


Fe-ADH 


Iron-containing alcohol 
dehydrogenase 


9.9e-35 


124.5


2 


4-205:228- 
255 


10 


Fe-ADH 


Iron-containing alcohol
dehydrogenase 


9.9e-35 


124.5 


2 


52-253:276- 
303 


11 


Bcl-2 


Apoptosis regulator proteins, 
Bcl-2 family 


0.016 


-2.1 


1 


257-356 


12 


spectrin 


Spectrin repeat 


1.3e-10 


43.6 


3 


11-87:90- 
197:200-291 


13 


Ribosomal_L18ae


Ribosomal L18ae protein 
family 


1.9e-128 


440.1 


1 


6-176 


14 


Ribosomal_L31e


Ribosomal protein L31e


2.4e-47 


170.7 


1 


72-166 


15 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-
H type


7.8e-16 


66.0 




342-367:371- 
396:398-420 


10 




MYND finger 


1.4e-13 


58.5 


1 


52-90 


17


Sterile 


Male sterility protein 


1.1e-51


185.1 




254-446 


18 


MgtE 


Divalent cation transporter 


8.6e-39 


142.3 


-| 


138-274:352- 
499 


19


Rap_GAP


Rap/ran-GAP 


2e-124 


426.7 


1 


400-588 


19




PDZ domain (Also known as 
DHR or GLGF)


2.4e-06 


34.5 




726-800 


20 


Rap_GAP 


Rap/ran-GAP 


2e-124 


426.7 


1


400-588




PDZ


PDZ domain (Also known as 
DHR or GLGF)


2.4e-06 


34.5 




726-800 


22 


SCAN 


SCAN domain 


1.5e-23 


91.7 


1


165-238 




RhoGAP


RhoGAP domain 


3e-58 


206.9 


1 


497-649 


23


FCH


Fes/CIP4 homology domain


1.2e-18 


75.4 




22-121 


23 


SH3 


SH3 domain 


2.6e-11


51.0 


1 


723-777 


24 


adh_zinc


Zinc-binding dehydrogenases


juoe-uj 






ZU-jJO 


25 1 


UDPGT 


UDP-glucoronosyl and UDP- 
glucosyl transferas 


1.6e-84 


294.3 




26-467 


28 


Ribosomal_L6e


Ribosomal protein L6e 


4.3e-77 


269.5 




109-239 


29 


Ribosomal_L11


Ribosomal protein L11


4.9e-64 


226.2 




13-144 


30 


tRNA-synt_1e


tRNA synthetases class I (C) 


1.6e-137 


470.2 




64-538 


32 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


0.00041 


17.6 




33-72:165- 
185 


34 


ras 


Ras family 


1.4e-77


271.2 




35-235 


34 


arf 


ADP-ribosylation factor
family 


9.3e-05 


-56.3 




17-198 


36 


SET 


SET domain 


3.2e-05 


10.0 


1 


209-342 


36 


MORN 


MORN repeat 


0.006 


23.2


3 


36-58:59-
81:106-128 
SEQ
ID 


Pfam Model


Description 


E-value 


Score 


No: of 

Pfam 

Domains 


Position of 
the Domain 


37 


laminin_G


Laminin G domain 


1.5e-11


44.7 




55-174 


37


EGF


EGF-like domain


0.0033


24.1 


1 


202-234 


38


Sema 


Sema domain 


1.7e-127


436.9 


1 


56-489 


38 


Plexin_repeat 


Plexin repeat 


le-06 


35.7 


1


507-563 


38


ig 


Immunoglobulin domain 


0.0023 


15.9 




582-639 


38 


integrin_B


Integrins, beta chain 


0.084 


6.1 


1 


513-527 


40 


filament 


Intermediate filament protein 


1.6e-138 


473.6 




129-442 


41 


Keratin_B2


Keratin, high sulfur B2 
protein 


1.8e-18 


74.8 


2 


2-138:139- 
240 


44 


sushi 


Sushi domain (SCR repeat) 


3.8e-06 


33.9 


4 


1396- 

1459:1464- 

1521:1525- 

1590:1595- 

1646 


45 


profilin 


Profilin 


4.1e-13 


51.7 


1 


10-124 


47 


ubiquitin 


Ubiquitin family 


0.00033 


20.5


1 


31-99 


48 


BTB 


BTB/POZ domain 


2.6e-21 


84.2 


1 


80-196 


48 


Kelch 


Kelch motif 


2.6e-20 


80.9 


4 


336-382:384- 

430:432- 

478:582-635 


48


SCP 


SCP-like extracellular protein


0.015 


13.0 


1 


1-35 


49 


serpin 


Serpin (serine protease 
inhibitor) 


2.4e-178


605.4


1 


59-432 


50


T-box


T-box


3.6e-125


429.2


1


140-331


52




7 transmembrane receptor 
(rhodopsin family) 


1.2e-17


58.3


2 


132-228:337-
344


53 


CSD 


'Cold-shock' DNA-binding
domain 


1.8e-16


63.6 


1 


42-112 






Zinc knuckle 


a aaai o 
U.UUU1Z 


Zo.o 


Z 


176 




ig


Immunoglobulin domain 


Z.5e-U/ 


Zo./ 


i 
1 


34-iuy 


55 


Rap_GAP 


Rap/ran-GAP 


5e-18 


73.3


1


287-466 


57


G-gamma 


GGL domain


l.oe-ll 


39.4 


2 


49-70:109-


55 


T-box 


T-box 


8.9e-114 


391.4 


1 


101-302 


59 


Gag_p10


Retroviral GAG p10 protein


9.2e-06


23.7 


1 


82-171 


61


60s_ribosomal 


60s Acidic ribosomal protein 


0.0089 


12.0 


1 


1-22 


62 


UPAR_LY6


u-PAR/Ly-6 domain 


5.4e-05 


22.3 


1 


8-51 


63 


Ribosomal_L30


Ribosomal protein L30p/L7e


0.00042 


18.5 


1 


65-93 


64


filament


Intermediate filament protein 


x.ie-/o 


11 A Q 


2


l01-33o:339- 
426 


65 


Ribosomal_S6 


Ribosomal protein S6 


0.00082 


7.5 


1 


2-96 


66 


PDZ 


PDZ domain (Also known as 
DHR or GLGF) 


5.1e-09 


43.4 


1 


158-250 


67 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


0.005 


14.0 


1 


92-118 


68 


G-patch 


G-patch domain 


6.8e-07


36.3 


1 


26-70 


69 


Keratin_B2 


Keratin, high sulfur B2 
protein 


0.037 


-455 


1 


10-155 


83 




Immunoglobulin domain 


8.5e-09


33.4 


2 


34-89:119- 
187 


86 


zf-C2H2 


Zinc finger, C2H2 type 


2.2e-71


250.6 


17 


182-204:210- 
SEQ
ID 


Pfam Model 


Description 


E-value


Score 


No: of 

Pfam 

Domains 


Position of 
the Domain 














232:237- 
260:265- 
288:315- 
337:343- 
365:369- 
392:653- 
675:681- 
704:709- 
733:741- 
764:791- 
814:820- 
842:848- 
870:877- 
899:905- 
928:952-975 


87 


ig 


Immunoglobulin domain 


2.7e-35 


118.7 


6 


36-121:162- 

249:292- 

375:422- 

517:564- 

657:704-795 


88 


MAP1_LC3


Microtubule associated 
protein 1A/1B, light 


9.4e-79 


275.0 


1 


118-221 


89 


WD40 


WD domain, G-beta repeat 


1.6e-12 


55.1 


4 


173-215:221- 
263:269- 
305:1103- 
1140 


90 


FKBP 


FKBP-type peptidyl-prolyl
cis-trans isomeras 


1.2e-59 


198.9 


1 


66-160 


92 


RPEL 


RPEL repeat 


6.5e-18 


73.0 


2 


513-538:551- 
576 


93 


transket_pyr


Transketolase, pyridine
binding domain 


4.6e-65 


229.6 


1 


568-773 


93 


E1_dehydrog


Dehydrogenase E1
component 


8.7e-23 


89.1 


1 


193-504 


95 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


8.7e-09


32.7


1 


595-635 


97 


ig 


Immunoglobulin domain 


1.8e-20 


71.0 


3 


31-88:127- 
185:222-278 


98 


ig 


Immunoglobulin domain 


1.8e-20 


71.0 


3 


24-81:120- 
178:215-271 


99 


Patched 


Patched family 


6.2e-06 


-369.1 


1 


66-935 


102 


zf-C2H2 


Zinc finger, C2H2 type 


2.3e-94 


326.9 


12 


209-231:237- 

259:265- 

287:293- 

315:321- 

343:349- 

371:377- 

399:405- 

427:433- 

455:461- 

483:489- 

511:594-616 
No: of

Pfam 

Domains 


Position of 
the Domain 


102 


KRAB 


KRAB box 


3.7e-37 


136.9 


1 


15-77 


103 


zf-C2H2 


Zinc finger, C2H2 type 


1.2e-55 


198.2 


9 


172-195:271- 

293:299- 

321:327- 

349:355- 

377:383- 

405:411- 

433:439- 

461:467-489 


103 


KRAB 


KRAB box 


3e-46 


167.1 


1 


8-70 


107 


zf-CCHC 


Zinc knuckle 


2.4e-16


67.8 


3 


913- 

930:1293- 

1310:1358- 

1375 


107 


NTP_transf_2


Nucleotidyltransferase 
domain 


4.4e-11


50.3 


1 


972-1065 


108 


zf-C2H2


Zinc finger, C2H2 type 


1.6e-42 


154.7 


5 


283289- 
311:317- 
339:345- 
367:373-395 


109 


myosin_head


Myosin head (motor domain) 


0 


1267.5 


1 


26-697 


109 


IQ 


IQ calmodulin-binding motif 


1.2e-17


72.1 


4 


714-734:737- 

757:760- 

780:789-809 


110 


pkinase 


Protein kinase domain 


1.2e-96 


334.5 


1 


20-271 


111 


WD40 


WD domain, G-beta repeat 


1.8e-49 


177.8 


8 


161-197:218-

253:258-

294:300- 

335:341- 

377:383- 

428:434- 

470:476-511 


112 


SNF2_N


SNF2 and others N-terminal 
domain 


4.2e-78 


272.9 


1 


1-264 


112 


helicase_C 


Helicase conserved C-
terminal domain 


1.2e-24 


95.4 


1 


326-410 


113 


DUF15 


Domain of unknown function 
DUF15


0.00064 


-60.4 


1


132-384 


114 


DSPc 


Dual specificity phosphatase, 
catalytic 


0.0004 


-2.9 


1 


141-295 


114 


Y_phosphatase


Protein-tyrosine phosphatase 


0.0037 


-26.9 




128-295 


115 


Ulp1_C


Ulp1 protease family, C-
terminal catalytic d


2.8e-52 


187.1 




394-587 


117 


Rhodanese 


Rhodanese-like domain


le-05 


32.4 




160-260 


119 


ABC1 


ABC1 family 


1.7e-40


147.9 




318-434 


122 


proteasome 


Proteasome A-type and B- 
type 


7.4e-43 


155.8 




39-146 


124 


Ribosomal_L9


Ribosomal protein L9 


3.1e-05 


-3.4 




94-240 


125 


RIO1


RIO1/ZK632.3/MJ0444 
family 


7.8e-80 


278.6 




193-387 


128 


abhydrolase 


alpha/beta hydrolase fold 


4.5e-20


80.1 


1 


121-364 


129 


TPR 


TPR Domain 


4.8e-27


103.3 


7 


355-388:473- 
SEQ
ID


Pfam Model


Description 


E-value 


Score 


No: of 

Pfam 

Domains 


Position of 
the Domain 














506:507- 
540:654- 
687:688- 
721:722- 
755:756-789 


130


HMG14_17 


HMG14 and HMG17 


1.9e-15 


64.7 


1 


2-73 


131 


bZIP


bZIP transcription


8.3e-19 


71.7 


1 


288-352 


132 


rrm


RNA recognition motif. 


1.9e-31 


117.9 


3 


432-502:546- 
616:858-929 


133 


AMP-binding 


AMP-binding enzyme 


7.1e-117 


401.7 


1 


142-580 


138 


tubulin 


Tubulin/FtsZ family 


2.1e-151 


516.4 


1 


1-223 


141 


laminin_EGF


Laminin EGF-like (Domains
III and V)


7.6e-12 


52.8 


4 


252-297:300- 
348:1342- 
1391:1469- 
1530 


141 


Kelch 


Kelch motif 


1.6e-05 


31.8 


4 


654-702:760- 
811:873-
918:929-990 


141 


integrin_B


Integrins, beta chain 


0.0061 


9.4 


3 


44-59:100- 

117:1019- 

1028 


141 


EGF 


EGF-like domain 


0.092 


19.3 


8 


167-203:207- 

235:297- 

331:496- 

533:538- 

569:1271- 

1308:1312- 

1338:1478- 

1508 


142 


RUN 


RUN domain 


8e-44 


159.0 


1 


31-163 


142 


FYVE 


FYVE zinc finger 


2.3e-29 


109.1


1 


529-593 


143 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-33 


124.7 


5 


442-464:505- 
527:533- 
555:561- 
583:589-611 


143 


BTB 


BTB/POZ domain 


1.6e-22 


88.2 


1 


30-143 


144 


mito_carr 


Mitochondrial carrier protein 


3.6e-61


216.6 


3 


10-158:160- 
250:254-354 


146 


DAGKc 


Diacylglycerol kinase 
catalytic domain


0.00015 


26.0 


1 


157-303 


147 


Exonuclease 


Exonuclease 


1.6e-41


151.4 


1 


228-384 


147 


rrm


RNA recognition motif. 


9.5e-08 


39.2 


2 


507-574:602- 
674 


151 


WH2 


WH2 motif 


6.5e-20 


79.6 


3 


1194- 
1214:1234- 
1254:1322- 
1342 


154 


DHDPS 


Dihydrodipicolinate 
synthetase family 


9.1e-21 


82.4 


1 


3-270 


156 


PseudoU_synth_l 


tRNA pseudouridine synthase 


le-30 


115.4 


1 


111-322 


157 


pkinase 


Protein kinase domain 


2.3e-59 


210.6 


1 


216-512 


158 


ubiquitin 


Ubiquitin family 


2.4e-05 


24.6 


1 


3-79 
160


IF-2B 


Initiation factor 2 subunit 
family 


1.7e-98 


340.7 


1 


157-475 


161 


Beach 


Beige/BEACH domain 


1.1e-224


759.8 


1 


1470-1747 


161 


WD40 


WD domain, G-beta repeat 


2.9e-08 


40.9 


5 


1848- 

1882:1888- 
1928:1947- 
1983:2030- 
2064:2071- 
2107 


164 


DnaJ 


DnaJ domain 


1.9e-16 


68.1 


1 


125-189 


165 


Anti_proliferat


BTG1 family 


7.4e-85 


295.3 


1 


1M64 


166 


sugar_tr 


Sugar (and other) transporter 


1.2e-78


274.7 


1 


34-548 


167 


sugar_tr 


Sugar (and other) transporter 


7e-52 


185.8 


1 


34-480 


168 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-93 


324.0 


13 


222-244:250- 

272:278- 

300:306- 

328:334- 

356:362- 

384:390- 

412:418- 

440:446- 

468:474- 

496:502- 

524:530- 

552:558-580 


168 


KRAB 


KRAB box 


1.8e-35


131.2 


1 


57-119 


169 


GBP 


Guanylate-binding protein, 
N-terminal domain 


le-191 


636.2 


1 


1-275 


169 


GBP_C 


Guanylate-binding protein, C- 
terminal domain 


6.6e-162 


551.3 


1 


277-573 


170 


cyclin 


Cyclin, N-terminal domain 


0.0022 


9.3 


1 


48-192 


171 


TPR 


TPR Domain 


9.7e-43 


155.4 


6 


133-166:167- 

200:201- 

234:282- 

315:316- 

349:350-383 


173 


RhoGEF 


RhoGEF domain 


3.3e-40 


147.0 


1 


166-345 


173 


PH 


PH domain 


6.5e-14


54.5 


1 


378-483 


173 


SH3 


SH3 domain 


1.1e-10


48.9 


1 


72-126 


174




Zinc finger, C3HC4 type
(RING finger) 


0.00011 


19.4 


1 


18-55 


174 


GBP_C 


Guanylate-binding protein, C- 
terminal domain


0.016 


12.1 


1 


86-114 


175 


Peptidase_M22 


Glycoprotease family 


2.3e-73 


257.2 


1 


1-324 


177 


TBC 


TBC domain 


4.7e-08 


10.1 


1 


57-268 


178 


transmembrane4 


Tetraspanin family 


1.6e-78 


259.2 


1 


16-261 


179 


CH 


Calponin homology (CH) 
domain 


1.2e-25 


98.6 


1 


24-133 


179 


calponin 


Calponin family repeat 


1.7e-14


51.8 


1 


174-199 


182 


AP_endonucleas1


AP endonuclease family 1 


2.6e-17 


59.4 


2 


1-36:50-135 


184 


Bacterial_PQQ


PQQ enzyme repeat 


9.3e-05 


29.2 


2 


52-89:534- 
571 
185


DEAD 


DEAD/DEAH box helicase 


1.6e-60 


194.3


1 


216-420 


185 


helicase_C 


Helicase conserved C- 
terminal domain 


5.9e-25 


96.3 


1 


454-540 


186 


zf-C2H2 


Zinc finger, C2H2 type 


3.2e-24


93.9 


6 


106-128:134- 

156:162-

184:195- 

218:477- 

499:505-529 


187 


sugar_tr


Sugar (and other) transporter 


0.0014 


-90.1 


1 


272-672


188 


tRNA_int_endo


tRNA intron endonuclease, 
catalytic C-t 


0.0025 


-7.7 


1 


73-159 


189 


WSC


WSC domain 


le-35 


132.1 


1 


175-254 


189 


Sulfotransfer 


Sulfotransferase protein 


4e-34 


126.8 


1 


356-586 


191 


pkinase 


Protein kinase domain 


5.1e-75 


262.6 


1 


148-421 


191 


PDZ 


PDZ domain (Also known as 
DHR or GLGF)


1.3e-05 


32.1 


1 


740-827 


193 


globin 


Globin 


1.9e-26 


96.6 


1 


3-78 


195 


WD40 


WD domain, G-beta repeat 


6.7e-14 


59.6 


4 


64-108:116- 

153:158- 

194:288-323 


197 


BRO1


BRO1-like domain


0.0042 


-29.4 


1 


9-161 


198 


F_actin_cap_B


F-actin capping protein, beta 
subunit 


1.7e-224 


759.2 


1 


1-269 


199 


ank 


Ank repeat 


le-66 


235.0 


8 


40-73:82- 

114:115- 

147:148- 

180:181- 

212:213- 

246:481- 

526:527-559 


203 


PDZ 


PDZ domain (Also known as 
DHR or GLGF)


4.2e-07 


37.0 


1 


211-293 


204 


SAM 


SAM domain (Sterile alpha 
motif) 


1.2e-11


52.1 


1 


5-70 


205 


SAM 


SAM domain (Sterile alpha 
motif) 


1.2e-11


52.1 


1 


5-70 


206 


zf-UBR1


Putative zinc finger in N- 
recognin 


4.7e-25 


96.7 


1 


978-1046 


207 


ABC_tran


ABC transporter 


2.4e-112 


386.6 


2 


467- 

647:1536- 
1717 


209 


zf-C2H2 


Zinc finger, C2H2 type 


0.00035 


27.3 


1 


200-225 


210 


UCH-2 


Ubiquitin carboxyl-terminal 
hydrolase family 


1.5e-19 


78.4 


1 


385-454 


211 


IMP4 


Domain of unknown function 


2.2e-33 


124.3 


1 


144-297 


213 


zf-C2H2 


Zinc finger, C2H2 type 


2.9e-08 


40.9 


3 


12-37:173- 
198:208-230 


214 


LysM 


LysM domain 


2.1e-11


51.3 


1 


73-116 


215 


ank 


Ank repeat 


1.1e-05


32.3 


2 


834-867:879- 
912 


215 


TIG 


IPT/TIG domain


0.009 


22.6 


1 


642-723 


217 


pyr_redox


Pyridine nucleotide- 


1.7e-71 


251.0 


1 


196-470 
disulphide oxidoreducta










217 


Rieske


Rieske [2Fe-2S] domain


6.2e-20 


79.6 


1 


68-168 


218 


PDZ 


PDZ domain (Also known as 
DHR or GLGF)


8.5e-19 


75.9 


1 


642-728 


219 


pkinase 


Protein kinase domain 


8.1e-67 


235.4 


1 


26-204 


220 


dsrm 


Double-stranded RNA 
binding motif 


0.095 


15 


1 


100-172 


221 


PHD 


PHD-finger 


5.4e-05 


29.6 


1 


147-203 


222 


L27 


L27 domain 


6.5e-16


66.3 


1 


13-68 


222 


SAM 


SAM domain (Sterile alpha 
motif) 


7.2e-10 


46.2 


2 


1051- 

1117:1166- 
1230 


223 


TRM 


N2,N2-dimethylguanosine
tRNA methyltransfera 


7.3e-22


86.1 


1 


227-693 


224 


LIM


LIM domain


5.3e-06 


33.4 


2 


124-180:183- 
243 


225 


ig 


Immunoglobulin domain 


1.1e-07


29.8 


1 


55-144 


227 


F-box 


F-box domain 


1.3e-05 


32.1 


1 


11-59 


229 


Glucosamine_iso


Glucosamine-6-phosphate 
isomerases/6- 


2.7e-158 


539.3 


1 


15-250 


231 


PTN_MK


PTN/MK heparin-binding 
protein family 


3.6e-44 


160.2 


1 


51-148 


236 


ion_trans 


Ion transport protein 


1.6e-22 


88.3 


1 


174-393 


238 


GNS1_SUR4


GNS1/SUR4 family 


5.2e-46 


166.3 


1 


10-265 


240 


ubiquitin 


Ubiquitin family 


2.7e-05 


24.4 


1 


10-89 


241 


PIP5K 


Phosphatidylinositol-4- 
phosphate 5-Kinase 


1.5e-155 


530.2 


1 


124-420 


242 


cadherin 



Cadherin domain 


0 


1298.9 


19 


1-75:89- 

180:194- 

290:355- 

434:448- 

549:563- 

652:671- 

774:788- 

881:896- 

988:1002- 

1092:1106- 

1192:1206- 

1295:1309- 

1379:1393- 

1489:1503- 

1594:1608- 

1699:1713- 

1808:1814- 

1910:1922- 

2016 


244 


fn3


Fibronectin type III domain


1.2e-31 


118.6 


4 


58-140:152- 

238:249- 

333:345-426 


245 


UQ_con 


Ubiquitin-conjugating 
enzyme 


1.4e-16 


68.5 


1 


93-250 


246 


LRR 


Leucine Rich Repeat 


1.7e-14


61.6 


6 


51-75:76- 
99:155-
178:181- 
203:204- 
226:227-251 


247 


lipocalin 


Lipocalin / cytosolic fatty- 
acid binding pr


1.2e-28 


102.8 


1 


164-294 


248 


Ribosomal_S2 


Ribosomal protein S2 


2.9e-11


43.7 


1 


33-80 


249


tubulin 




8.5e-163




1 


1-277 


250 


tubulin




9 4e-919 


718.8


1


1-351 


251 


ATP-synt_ab 


ATP synthase alpha/beta
family, nucleot


1.2e-75 


264.8 


1 


138-346 


251 


ATP-synt_ab_C


ATP synthase alpha/beta 


2.7e-38 


140.6 


1 


348-456 


251 


ATP-synt_ab_N 


ATP synthase alpha/beta
family, beta-ba


5.4e-19 


76.5 


1 


67-135 


252


ATP-synt_ab


ATP synthase alpha/beta
family, nucleot


X..JC-/V 




1


lJOVfT 


252


ATP-synt_ab_N


ATP synthase alpha/beta
family, beta-ba


5 Af>_10 


76.5


1


V/-1JJ 


253


zf-C3HC4


Zinc finger, C3HC4 type
(RING finger) 


5^-19 


43.9


1


30-70 


254


G-patch


G-patch domain




49.1


1


410-456 


255




Calponin homology (CH)
domain


l.\JC*l X 


51.7


1


24-134 


256 


RF-1 


Peptidyl-tRNA hydrolase
domain 


5.9e-66 


232.5 


1 


225-338 


257


RF-1


Peptidyl-tRNA hydrolase
domain




232.5


1


189-309


258




OTU-like cysteine protease


*t.*tC"10 


73.5


1


189-304


259




'Phi rwv»/4 / w i n 




35.7


0 


119-165:662-
695


260 


thyroglobulin_1


Thyroglobulin type-1 repeat 


3.1e-34 


127.2 


2 


95-158:227- 
292 


260 


kazal


Kazal-type serine protease
inhibitor


9.3e-07 


35.9


1


43-87 


262 


DnaJ 


DnaJ domain 


4.1e-15 


63.6 


1 


277-338 




WD40


WD domain, G-beta repeat




83.6


5


3-42:49-
86:97-
133:142- 
178:184-220 


265 


DUF6 


Integral membrane protein 
DUF6 


0.083 


9.1 


2 


81-316:338- 
470 


266 


Ribosomal_L31e


Ribosomal protein L31e 


1.7e-61


217.7 


1 


15-109 


268


F5_F8_type_C


F5/8 type C domain 


2.4e-65 


230.5 


1 


42-196 


268 


Zn_carbOpept


Zinc carboxypeptidase


3.5e-50 


180.1 


2 


224-341:400- 
600 


270 


BTB 


BTB/POZ domain 


7.7e-18 


72.7 


1 


8-119 


270 


zf-C2H2 


Zinc finger, C2H2 type 


4.2e-13 


57.0 


4 


254-276:363- 

385:390- 

412:448-468 


271 


Glycos_transf_1


Glycosyl transferases group 1 


0.027 


12.8 


1 


291-385 


272 


HEAT 


HEAT repeat 


2.2e-07 


38.0 


3 


237-275:276- 
315:674-712


273 


HEAT 


HEAT repeat 


2.2e-07 


38.0 


3 


237-275:276- 
315:640-678 


275 


SPRY 


SPRY domain 


2.6e-34 


127.4 


1 


390-515 


275 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


le-16 


58.3


1 


29-69 


277 


BTB 


BTB/POZ domain 


6e-27 


103.0 


1 


36-149 


277 


Kelch 


Kelch motif 


9.7e-21 


82.3 


4 


331-390:392- 

441:443- 

493:540-586 


278 


zf-C2H2


Zinc finger, C2H2 type 


4.1e-llo 


399.2 


14 


193-215:221- 

243:249- 

271:277- 

299:305- 

327:333- 

355:361- 

383:389- 

4ii:4i /- 

439:445- 

HO / .*r/ J- 

495:501- 
523.529- 
551:557-579 


279 | SCAN | SCAN domain | 2.4e-52 | 187.3 | 1 | 36-132
279 | zf-C2H2 | Zinc finger, C2H2 type | 2.4e-51 | 184.0 | 7 | 348-370:375-397:403-425:431-453:459-480:486-508:514-537
281 | Zip | ZIP Zinc transporter | [illegible] | [illegible] | 1 | 1-140
[illegible] | NTP_transf_2 | Nucleotidyltransferase domain | [illegible] | [illegible] | 1 | [illegible]-174
286 | zf-C2H2 | Zinc finger, C2H2 type | 2.8e-93 | 323.3 | 12 | 118-140:146-168:174-196:202-224:230-252:258-280:286-308:314-336:342-364:370-392:398-420:426-448
286 | KRAB | KRAB box | 3.6e-38 | 140.2 | 1 | 8-70
287 | zf-C2H2 | Zinc finger, C2H2 type | 5.3e-124 | 425.4 | 17 | 183-205:211-233:239-261:267-289:295-317:323-345:351-373:379-401:407-429:435-457:463-485:491-513:519-541:547-569:575-597:603-625:631-653
289 | DiHfolate_red | Dihydrofolate reductase | 7.4e-77 | 268.8 | 1 | 4-185


291 | PDZ | PDZ domain (Also known as DHR or GLGF) | 7.4e-17 | 69.4 | 1 | 5-84
293 | PH | PH domain | 1.4e-08 | 35.5 | 1 | 44-147
294 | adh_short | short chain dehydrogenase | 3.9e-29 | 110.2 | 1 | 36-284
297 | PKD | PKD domain | 9.9e-09 | 42.4 | 2 | 663-753:756-839
297 | BNR | BNR repeat | 3.2e-06 | 34.1 | 5 | 115-126:156-167:351-362:428-439:470-481
300 | HMG_box | HMG (high mobility group) box | 5.4e-05 | 20.0 | 1 | 245-304
301 | ig | Immunoglobulin domain | 0.05 | 11.6 | 1 | 629-688
302 | zf-C3HC4 | Zinc finger, C3HC4 type (RING finger) | 5e-12 | 43.2 | 1 | 39-79
303 | START | START domain | 0.015 | 4.1 | 1 | 1790-1994
304 | integrase | Integrase DNA binding domain | 7.2e-06 | 32.9 | 1 | 51-96
305 | myosin_head | Myosin head (motor domain) | 7.6e-279 | 939.7 | 2 | 11-668:689-733
306 | zf-C2H2 | Zinc finger, C2H2 type | 8.5e-54 | 192.1 | 7 | 66-88:94-116:122-144:150-172:178-200:280-303:317-339
307 | ig | Immunoglobulin domain | 0.00023 | 19.1 | 2 | 35-104:136-194
309 | ras | Ras family | 0.00079 | -93.3 | 1 | 38-176
310 | ig | Immunoglobulin domain | 2.1e-06 | 25.7 | 1 | 37-112
311 | EF1BD | EF-1 guanine nucleotide exchange domain | 4.7e-56 | 199.6 | 1 | 139-225
312 | BTB | BTB/POZ domain | 8.4e-25 | 95.8 | 1 | 51-164
313 | zf-C2H2 | Zinc finger, C2H2 type | 7.7e-59 | 208.9 | 9 | 118-140:197-219:281-303:309-331:337-359:365-387:393-415:421-443:449-471
313 | KRAB | KRAB box | 1.4e-17 | 71.8 | 1 | 41-99
314 | Hydrolase | haloacid dehalogenase-like hydrolase | 0.045 | 8.2 | 1 | 213-671
315 | cNMP_binding | Cyclic nucleotide-binding domain | 4e-26 | 100.2 | 1 | 387-475
315 | ion_trans | Ion transport protein | 3.8e-19 | 77.0 | 1 | 69-290
316 | Peptidase_S26 | Signal peptidase I | 2.8e-16 | 56.3 | 2 | 38-98:117-139
317 | zf-C2H2 | Zinc finger, C2H2 type | 4.3e-56 | 199.8 | 9 | 156-178:184-206:212-234:240-262:268-290:296-318:324-346:352-374:378-400
317 | KRAB | KRAB box | 6.7e-16 | 66.3 | 1 | 11-73
319 | UPF0073 | Uncharacterised protein family | 1.8e-09 | 27.9 | 1 | 33-276
320 | EGF | EGF-like domain | 4.7e-08 | 40.2 | 1 | 26-59
321 | lectin_c | Lectin C-type domain | 8.6e-15 | 62.6 | 1 | 268-374
325 | MAM | MAM domain | 1.3e-52 | 188.2 | 1 | 338-503
325 | ig | Immunoglobulin domain | 1.9e-15 | 54.8 | 3 | 41-101:138-202:346-420
327 | MAM | MAM domain | 5.3e-180 | 611.4 | 4 | 26-169:170-329:342-498:509-666
328 | Sema | Sema domain | 1.5e-211 | 716.2 | 1 | 56-491
329 | zf-C2H2 | Zinc finger, C2H2 type | 1.5e-84 | 294.3 | 13 | 170-192:198-220:226-248:254-276:282-304:310-332:338-360:366-388:394-416:422-444:450-472:478-500:506-528
331 | PAP2 | PAP2 superfamily | 8e-22 | 85.9 | 1 | 160-314
332 | LRR | Leucine Rich Repeat | 3.4e-36 | 133.7 | 11 | 58-81:82-105:106-129:130-153:154-177:178-201:202-225:250-273:274-297:298-321:322-345
332 | ig | Immunoglobulin domain | 2.5e-08 | 31.9 | 1 | 425-485
332 | LRRNT | Leucine rich repeat N-terminal domain | 2.5e-05 | 31.1 | 1 | 27-56
332 | LRRCT | Leucine rich repeat C-terminal domain | 0.0029 | 24.3 | 1 | 355-408
333 | AdoHcyase | S-adenosyl-L-homocysteine hydrolase | 1.5e-280 | 945.4 | 1 | 214-640


334 | TBC | TBC domain | 9.4e-38 | 138.9 | 1 | 89-302
341 | WD40 | WD domain, G-beta repeat | 0.00094 | 25.9 | 2 | 2-32:109-146
342 | ABC1 | ABC1 family | 0.051 | -29.9 | 1 | 3-50
344 | globin | Globin | 3e-45 | 162.2 | 1 | 1-141
345 | globin | Globin | 7.5e-39 | 139.9 | 2 | 1-31:68-179
347 | F-box | F-box domain | 1.5e-07 | 38.5 | 1 | 24-72
348 | HLH | Helix-loop-helix DNA-binding domain | 2e-08 | 41.4 | 1 | 83-137
349 | KRAB | KRAB box | 2.7e-39 | 144.0 | 1 | 4-66
350 | UCH-2 | Ubiquitin carboxyl-terminal hydrolase family | 1.7e-19 | 78.2 | 1 | 645-705
350 | UCH-1 | Ubiquitin carboxyl-terminal hydrolases famil | 9.1e-15 | 62.5 | 1 | 363-394
350 | zf-UBP | Zn-finger in ubiquitin-hydrolases and other | 0.00069 | 18.9 | 1 | 236-306
351 | NUDIX | MutT-like domain | 8.2e-12 | 52.7 | 1 | 50-200
352 | IBR | IBR domain | 1.6e-12 | 55.0 | 1 | 101-166
353 | IBR | IBR domain | 1.6e-12 | 55.0 | 1 | 66-131
354 | SCP | SCP-like extracellular protein | 1.4e-34 | 128.3 | 1 | 56-208
356 | mito_carr | Mitochondrial carrier protein | 9.7e-78 | 271.7 | 3 | 10-125:127-220:232-321
358 | UCH-1 | Ubiquitin carboxyl-terminal hydrolases famil | 5.1e-15 | 63.3 | 1 | 323-354
358 | zf-UBP | Zn-finger in ubiquitin-hydrolases and other | 0.00049 | 19.4 | 1 | 195-264
360 | Phage_lysozyme | Phage lysozyme | 0.0014 | 23.4 | 1 | 94-184
362 | Ribosomal_S2 | Ribosomal protein S2 | 3.3e-08 | 32.9 | 1 | 20-62
364 | zf-C3HC4 | Zinc finger, C3HC4 type (RING finger) | 5.3e-09 | 33.4 | 1 | 291-329
365 | zf-C3HC4 | Zinc finger, C3HC4 type (RING finger) | 0.0096 | 13.1 | 1 | 109-148
367 | TPR | TPR Domain | 0.043 | 20.4 | 1 | 1-28
370 | zf-C2H2 | Zinc finger, C2H2 type | 5.3e-109 | 375.5 | 14 | 127-149:155-177:183-205:211-233:239-261:267-289:295-317:323-345:351-373:379-401:407-429:435-457:463-485:491-513
370 | SCAN | SCAN domain | 4.2e-38 | 140.0 | 1 | 27-122
371 | arf | ADP-ribosylation factor family | 4.9e-39 | 143.1 | 1 | 6-184

Table 4

SEQ ID | Pfam Model | Description | E-value | Score | No. of Pfam Domains | Position of the Domain










371 | ras | Ras family | 7.2e-06 | -70.1 | 1 | 22-186
372 | BNR | BNR repeat | 0.031 | 20.9 | 3 | 171-182:244-255:295-306
373 | zf-C2H2 | Zinc finger, C2H2 type | 8.3e-25 | 95.8 | 5 | 142-162:171-198:204-228:234-258:264-288
376 | rrm | RNA recognition motif. | 0.00019 | 28.2 | 1 | 112-163
377 | rrm | RNA recognition motif. | 2.2e-19 | 77.9 | 1 | 112-183
380 | vwc | von Willebrand factor type C domain | 1.6e-31 | 118.2 | 3 | 22-76:79-134:137-192
381 | Ribosomal_L35Ae | Ribosomal protein L35Ae | 0.00013 | 7.0 | 1 | 1-79
385 | ras | Ras family | 3.9e-63 | 223.2 | 1 | 35-229
385 | arf | ADP-ribosylation factor family | 1.7e45 | -46.9 | 1 | 18-202
388 | F-box | F-box domain | 1.5e-05 | 31.9 | 2 | 23-70:99-146
390 | SPRY | SPRY domain | 6.2e-10 | 46.4 | 1 | 101-239
391 | tRNA_Me_trans | tRNA methyl transferase | 1.9e-19 | 50.9 | 1 | 5-185
392 | zf-C2H2 | Zinc finger, C2H2 type | 4e-17 | 70.3 | 3 | 175-197:203-225:231-253
393 | SCAN | SCAN domain | 3.1e-39 | 143.8 | 1 | 389-484
393 | SPRY | SPRY domain | 1.8e-19 | 78.1 | 1 | 148-273
393 | zf-C2H2 | Zinc finger, C2H2 type | 4e-09 | 43.7 | 2 | 759-781:787-809
393 | zf-C3HC4 | Zinc finger, C3HC4 type (RING finger) | 0.0032 | 14.7 | 1 | 11-52
394 | Kelch | Kelch motif | 4e-53 | 189.9 | 5 | 329-375:377-431:433-479:481-525:527-572
394 | BTB | BTB/POZ domain | 6.1e-26 | 99.6 | 1 | 30-144
395 | C2 | C2 domain | 2.2e-80 | 280.4 | 2 | 159-251:296-384
396 | ank | Ank repeat | 5.6e-33 | 123.0 | 4 | 47-79:80-112:140-174:175-207
396 | PH | PH domain | 8.9e-05 | 22.0 | 1 | 236-334
397 | ank | Ank repeat | 1.7e-26 | 101.4 | 4 | 17-49:50-82:83-115:116-148
398 | Nucleoplasmin | Nucleoplasmin | 3.6e-29 | 110.4 | 1 | 13-209
400 | DAGKa | Diacylglycerol kinase accessory domain | 1.9e-124 | 426.8 | 1 | 598-778
400 | DAGKc | Diacylglycerol kinase catalytic domain | 7.1e-67 | 235.6 | 1 | 454-578
400 | DAG_PE-bind | Phorbol esters/diacylglycerol binding dom | 2.9e-23 | 90.7 | 2 | 261-310:326-374
400 | efhand | EF hand | 2.4e-12 | 54.4 | 2 | 169-197:214-242
403 | PDZ | PDZ domain (Also known as DHR or GLGF) | 7.7e-46 | 165.7 | 3 | 86-166:210-291:821-907


404 | zf-C2H2 | Zinc finger, C2H2 type | 2.6e-48 | 173.9 | 7 | 172-194:200-222:228-250:256-278:284-306:312-331:340-362
405 | K_tetra | K+ channel tetramerisation domain | 2.6e-23 | 90.9 | 1 | 51-146
406 | SNF | Sodium:neurotransmitter symporter family | 0 | 1268.7 | 1 | 60-657
407 | ig | Immunoglobulin domain | 1.1e-06 | 26.5 | 1 | 53-120
408 | DnaJ | DnaJ domain | 2.3e-27 | 104.3 | 1 | 4-68
408 | DnaJ_C | DnaJ C terminal region | 3.1e-08 | 38.1 | 1 | 192-314
409 | mito_carr | Mitochondrial carrier protein | 1.4e-57 | 204.7 | 3 | 5-100:102-201:205-302
410 | zf-C2H2 | Zinc finger, C2H2 type | 5.2e-97 | 335.7 | 12 | 141-163:169-191:197-219:225-247:253-275:281-303:309-331:337-359:365-387:393-415:421-443:449-473
411 | S_100 | S-100/ICaBP type calcium binding domain | 9.7e-13 | 55.8 | 1 | 5-48
411 | efhand | EF hand | 0.0012 | 25.6 | 1 | 54-82
413 | fn3 | Fibronectin type III domain | 8.6e-14 | 59.3 | 2 | 22-107:119-196
413 | PHD | PHD-finger | 9.6e-05 | 27.2 | 1 | 285-341
414 | zf-C2H2 | Zinc finger, C2H2 type | 2.3e-27 | 104.4 | 6 | 42-64:117-139:145-167:173-196:534-556:573-595
415 | SPRY | SPRY domain | 3.9e-18 | 73.7 | 1 | 347-467
415 | zf-C3HC4 | Zinc finger, C3HC4 type (RING finger) | 4.4e-14 | 49.9 | 1 | 16-56
415 | zf-B_box | B-box zinc finger | 9e-07 | 35.9 | 1 | 92-133
416 | pkinase | Protein kinase domain | 1.2e-54 | 195.0 | 1 | 97-317
417 | trypsin | Trypsin | 4.6e-38 | 122.5 | 1 | 41-234
418 | Glypican | Glypican | 5.7e-131 | 448.5 | 1 | 3-244
419 | Keratin_B2 | Keratin, high sulfur B2 protein | 0.0013 | -23.4 | 1 | 37-159
420 | Dynein_heavy | Dynein heavy chain | 0 | 1432.3 | 1 | 309-1019
421 | zf-C2H2 | Zinc finger, C2H2 type | 0.00039 | 27.2 | 3 | 75-99:203-227:266-290
422 | ig | Immunoglobulin domain | 0.00074 | 17.5 | 1 | 34-107
423 | fn3 | Fibronectin type III domain | 6e-08 | 39.8 | 1 | 443-531


424 | Keratin_B2 | Keratin, high sulfur B2 protein | 0.0023 | -27.1 | 2 | 5-150:152-251
425 | pkinase | Protein kinase domain | 2.3e-55 | 197.3 | 1 | 69-390
426 | ig | Immunoglobulin domain | 4.1e-09 | 34.4 | 1 | 35-112
427 | Galactosyl_T | Galactosyltransferase | 2.6e-35 | 130.8 | 1 | 158-349
428 | proteasome | Proteasome A-type and B-type | 5.5e-28 | 106.4 | 1 | 96-238
429 | 7tm_1 | 7 transmembrane receptor (rhodopsin family) | 3.4e-38 | 123.3 | 1 | 41-290
430 | BTB | BTB/POZ domain | [illegible] | [illegible] | 1 | [illegible]
430 | zf-C2H2 | Zinc finger, C2H2 type | 4.3e-07 | 37.0 | 2 | 472-494:500-
433 | p450 | [illegible] | [illegible] | [illegible] | 1 | [illegible]
434 | sugar_tr | Sugar (and other) transporter | 2.6e-64 | 227.1 | 1 | 10-512
435 | zf-C2H2 | Zinc finger, C2H2 type | 1.8e-52 | 187.8 | 9 | 287-309:315-337:546-568:574-596:606-628:844-866:872-894:980-1002:1008-1030
436 | 7tm_1 | 7 transmembrane receptor (rhodopsin family) | 2.2e-40 | 130.4 | 2 | 82-221:229-284
437 | FGF | Fibroblast growth factor | 4.6e-14 | 51.6 | 1 | 48-129
438 | Osteopontin | Osteopontin | 3.7e-181 | 615.2 | 1 | 1-294












PDB annotation 


TRANSCRIPTION
REGULATION PROTO-
ONCOGENE, NUCLEAR
BODIES (PODS), LEUKEMIA,
2 TRANSCRIPTION


TRANSFERASE HRS; HRS, 
VHS, FYVE, ZINC FINGER, 
SUPERHELIX 


METAL BINDING PROTEIN
RING FINGER PROTEIN
MAT1; RING FINGER
(C3HC4)




COMPLEX 

(ISOMERASE/DIPEPTIDE) 
PIN1; PEPTIDYL-PROLYL
CIS-TRANS ISOMERASE,
ROTAMASE, 2 COMPLEX
(ISOMERASE/DIPEPTIDE)
CONBCT 




COMPLEX 


(APOPTOSIS/PEPTIDE)
APOPTOSIS, ALTERNATIVE
SPLICING, COMPLEX
(APOPTOSIS/PEPTIDE)


APOPTOSIS HELICAL 
PROTEIN 


APOPTOSIS APOPTOSIS
REGULATOR BCL-X;
APOPTOSIS, PROGRAMMED
CELL DEATH, BCL-2
FAMILY




STRUCTURAL PROTEIN 


Compound 


TRANSCRIPTION FACTOR 
PML; CHAIN: NULL; 


HEPATOCYTE GROWTH 
FACTOR-REGULATED 
TYROSINE CHAIN: A; 


CDK-ACTIVATING
KINASE ASSEMBLY
FACTOR MAT1; CHAIN: A;




PEPTIDYL-PROLYL CIS- 
TRANS ISOMERASE; 
CHAIN: A; ALA-PRO 
DIPEPTIDE; CHAIN: B; 




BCL-XL; CHAIN: A; BAK 


PEPTIDE; CHAIN: B; 


APOPTOSIS REGULATOR 
BAX, MEMBRANE 
ISOFORM ALPHA; CHAIN: 
A; 


BCL-XL; CHAIN: NULL; 




ALPHA SPECTRIN; CHAIN: 


SEQFOLD score: 81.85
PMF score:
Verify score: -0.44
Psi Blast: 0.0061
START AA:
CHAIN ID:
PDB ID: 1maz
SEQ ID NO:
PDB annotation


TWO REPEATS OF 
SPECTRIN, ALPHA HELICAL 
LINKER REGION, 2 2 
TANDEM 3-HELIX COILED- 
COILS, STRUCTURAL 
PROTEIN 






RIBOSOME 50S RIBOSOMAL
PROTEIN L2P, HMAL2, HU;
50S RIBOSOMAL PROTEIN
L3P, HMAL3, HL1; 50S
RIBOSOMAL PROTEIN L4E,
HMAL4, HL6; 50S
RIBOSOMAL PROTEIN L5P,
HMAL5, HL13; 30S
RIBOSOMAL PROTEIN HS6;
50S RIBOSOMAL PROTEIN
L13P, HMAL13; 50S
RIBOSOMAL PROTEIN L14P,
HMAL14, HL27; 50S
RIBOSOMAL PROTEIN L15P,
HMAL15, HL9; 50S
RIBOSOMAL PROTEIN L18P,
HMAL18, HL12; 50S
RIBOSOMAL PROTEIN L18E,
HL29, L19; 50S RIBOSOMAL
PROTEIN L19E, HMAL19,
HL24; 50S RIBOSOMAL
PROTEIN L21E, HL31; 50S
RIBOSOMAL PROTEIN L22P,
HMAL22, HL23; 50S
RIBOSOMAL PROTEIN L23P,
HMAL23, HL25, L21; 50S
RIBOSOMAL PROTEIN L24P,
HMAL24, HL16, HL15; 50S


Compound

HUMAN SKELETAL 
MUSCLE ALPHA-ACTININ 
2; CHAIN: A; 




23S RRNA; CHAIN: 0; 5S
RRNA; CHAIN: 9;
RIBOSOMAL PROTEIN L2;
CHAIN: A; RIBOSOMAL
PROTEIN L3; CHAIN: B;
RIBOSOMAL PROTEIN L4;
CHAIN: C; RIBOSOMAL
PROTEIN L5; CHAIN: D;
RIBOSOMAL PROTEIN
L7AE; CHAIN: E;
RIBOSOMAL PROTEIN
L10E; CHAIN: F;
RIBOSOMAL PROTEIN
L13; CHAIN: G;
RIBOSOMAL PROTEIN
L14; CHAIN: H;
RIBOSOMAL PROTEIN
L15E; CHAIN: I;
RIBOSOMAL PROTEIN
L15; CHAIN: J;
RIBOSOMAL PROTEIN
L18; CHAIN: K;
RIBOSOMAL PROTEIN
L18E; CHAIN: L;
RIBOSOMAL PROTEIN
L19; CHAIN: M;
RIBOSOMAL PROTEIN
L21E; CHAIN: N;


SEQFOLD score: 72.83
PMF score: 0.00
Verify score:
Psi Blast:
START AA:
CHAIN ID:
PDB ID: 1quu
SEQ ID NO:
PDB annotation


LIGASE CBL, UBCH7, ZAP-
70, E2, UBIQUITIN, E3,
PHOSPHORYLATION, 2
TYROSINE KINASE,


i 

38 

a < 

li 


• 

■ 

1 

i 

i 


LIGASE CBL, UBCH7, ZAP- 
70, E2, UBIQUITIN, E3, 
PHOSPHORYLATION, 2 
TYROSINE KINASE, 


1 

P 


C 


METAL BINDING PROTEIN
RING FINGER PROTEIN
MAT1; RING FINGER
(C3HC4)


METAL BINDING PROTEIN
RING FINGER PROTEIN
MAT1; RING FINGER
(C3HC4)




DNA-BINDING PROTEIN
V(D)J RECOMBINATION
ACTIVATING PROTEIN 1;
RAG1, V(D)J
RECOMBINATION,


Compound 


VIRUS-1 (C3HC4, OR RING 
DOMAIN) 1CHC 3 (NMR, 1


STRUCTURE) 1CHC 4 


SIGNAL TRANSDUCTION 
PROTEIN CBL; CHAIN: A; 
ZAP-70 PEPTIDE; CHAIN: 
B; UBIQUITIN- 
CONJUGATING ENZYME 
E12-18 KDA UBCH7; 
CHAIN: C; 


SIGNAL TRANSDUCTION
PROTEIN CBL; CHAIN: A;
ZAP-70 PEPTIDE; CHAIN:
B; UBIQUITIN-
CONJUGATING ENZYME
E12-18 KDA UBCH7;
CHAIN: C;


CDK-ACTIVATING
KINASE ASSEMBLY
FACTOR MAT1; CHAIN: A;


CDK-ACTIVATING
KINASE ASSEMBLY
FACTOR MAT1; CHAIN: A;


RAG1; CHAIN: NULL;


SEQFOLD score: 60.57
PMF score: 0.33, 0.93, 0.51, 1.00
Verify score: 0.12, 0.73, -0.26, 0.25
Psi Blast: 3.4e-20
START AA:
CHAIN ID:
PDB ID: 1fbv
SEQ ID NO:
PDB annotation

BINDING/EFFECTOR), G
PROTEIN, EFFECTOR,
RABCDR, 2 SYNAPTIC
EXOCYTOSIS, RAB
PROTEIN, RAB3A,


HYDROLASE G PROTEIN,
VESICULAR TRAFFICKING,
GTP HYDROLYSIS, RAB 2
PROTEIN,


NEUROTRANSMITTER 
RELEASE, HYDROLASE 


TRANSCRIPTION 
REGULATION SIGMA70; 
RNA POLYMERASE SIGMA 
FACTOR, TRANSCRIPTION 
REGULATION 


COMPLEX (BLOOD
COAGULATION/INHIBITOR)
AUTOPROTHROMBIN IIA;
HYDROLASE, SERINE
PROTEINASE), PLASMA
CALCIUM BINDING, 2
GLYCOPROTEIN, COMPLEX
(BLOOD
COAGULATION/INHIBITOR)


ANTI-COAGULANT ANTI-
COAGULANT, PEPTIDIC
INHIBITORS,
CONFORMATIONAL 2
FLEXIBILITY, SERINE


SUGAR BINDING PROTEIN
UDA; LECTIN, HEVEIN
DOMAIN, UDA,
SUPERANTIGEN
SUGAR BINDING PROTEIN


Compound 


RAB3A; CHAIN: A; 




RNA POLYMERASE 
PRIMARY SIGMA 
FACTOR; CHAIN: NULL; 


ACTIVATED PROTEIN C; 
CHAIN: C, L; D-PHE-PRO- 
MAI; CHAIN: P; 


HIRUSTASIN; CHAIN:
NULL;


AGGLUTININ ISOLECTIN
VI/AGGLUTININ
ISOLECTIN V; CHAIN: A;

AGGLUTININ ISOLECTIN 


SEQFOLD score: 79.28, 50.24
PMF score: 0.65, 0.09
Verify score: 0.03
Psi Blast:
START AA:
CHAIN ID:
PDB ID: 1sig, 1aut, 1eis, 1eis
SEQ ID NO:
PDB annotation


UDA; LECTIN, HEVEIN
DOMAIN, UDA,
SUPERANTIGEN


SUGAR BINDING PROTEIN
UDA; LECTIN, HEVEIN
DOMAIN, UDA,
SUPERANTIGEN


SUGAR BINDING PROTEIN
UDA; LECTIN, HEVEIN
DOMAIN, UDA,
SUPERANTIGEN,
SACCHARIDE BINDING


SUGAR BINDING PROTEIN
UDA; LECTIN, HEVEIN
DOMAIN, UDA,
SUPERANTIGEN,
SACCHARIDE BINDING


SUGAR BINDING PROTEIN
UDA; LECTIN, HEVEIN
DOMAIN, UDA,
SUPERANTIGEN,
SACCHARIDE BINDING


GLYCOPROTEIN
GLYCOPROTEIN


Compound 


VI/AGGLUTININ 
ISOLECTIN V; CHAIN: A; 


AGGLUTININ ISOLECTIN
VI/AGGLUTININ
ISOLECTIN V; CHAIN: A;


AGGLUTININ ISOLECTIN
I/AGGLUTININ ISOLECTIN
V/ CHAIN: A;


AGGLUTININ ISOLECTIN 
I/AGGLUTININ ISOLECTIN 
V/ CHAIN: A; 


AGGLUTININ ISOLECTIN 
I/AGGLUTININ ISOLECTIN 
V/ CHAIN: A; 


LAMININ; CHAIN: NULL; 


AGGREGATION
INHIBITOR, GP
ANTAGONIST KISTRIN
(NMR, 8 STRUCTURES)


1KST 3


FACTOR IXA; CHAIN: C,
L; D-PHE-PRO-ARG;
CHAIN: I;


SEQFOLD score: 70.40
PMF score: 0.41, 0.62, 0.11, 0.03, 0.00
Verify score: -0.10, -0.12, -0.71, -0.05, -0.45
PDB annotation


ADHESION PROTEIN, 
TRANSMEMBRANE, 2 
GLYCOPROTEIN 




GLYCOPROTEIN 
GLYCOPROTEIN 


SERINE PROTEASE FVIIA;
FVIIA; BLOOD
COAGULATION, SERINE
PROTEASE


METAL BINDING PROTEIN 
BETA SANDWICH, 
CALCIUM-BINDING 
PROTEIN, METAL BINDING 
2 PROTEIN 


MEMBRANE ADHESION
SHORT CONSENSUS
REPEAT, SUSHI,
COMPLEMENT CONTROL
PROTEIN, 2 N-
GLYCOSYLATION, MULTI-
DOMAIN, MEMBRANE
ADHESION


MEMBRANE ADHESION
SHORT CONSENSUS
REPEAT, SUSHI,
COMPLEMENT CONTROL
PROTEIN, 2 N-
GLYCOSYLATION, MULTI-
DOMAIN, MEMBRANE
ADHESION


Compound 




GLYCOPROTEIN FACTOR 
H, 15TH C-MODULE PAIR 
(NMR, MINIMIZED 
AVERAGED 1HFIA 1 
STRUCTURE) 1HFI4 1HFIA 
5 


LAMININ; CHAIN: NULL; 


COAGULATION FACTOR
VIIA (LIGHT CHAIN);
CHAIN: L; COAGULATION
FACTOR VIIA (HEAVY
CHAIN); CHAIN: H;


LAMININ ALPHA2 CHAIN;
CHAIN: A, B, C, D;


HUMAN BETA2-
GLYCOPROTEIN I; CHAIN:
A;


HUMAN BETA2-
GLYCOPROTEIN I; CHAIN:
A; 
SEVEN-STRANDED
INCOMPLETE
ANTIPARALLEL UP-AND-
DOWN BETA 2 BARREL,
ACTIN-BINDING PROTEIN,
POLY-L-PROLINE BINDING
3 PROTEIN, PIP2 BINDING
PROTEIN


PROTEIN BINDING
ACETYLATION, ACTIN-
BINDING PROTEIN,
MULTIGENE FAMILY


ALLERGEN ALLERGY, 


ACTIN-BINDING PROTEIN 




ACTIN-BINDING PROTEIN
ACTIN-BINDING PROTEIN,
PROFILIN, CYTOSKELETON


ACTIN-BINDING PROTEIN
ACTIN-BINDING PROTEIN,
PROFILIN, CYTOSKELETON


PC 

§ s § 

A« 03 

rK f\ r*K 

0 y u 

1 i 

n q ffi 

ill 

^* Ph 




as bs Q§ 

si ri^ss 


SIGNALING PROTEIN RUB1;
UBIQUITIN-LIKE PROTEIN,
ARABIDOPSIS, SIGNALING
PROTEIN


DE NOVO PROTEIN
PROTEIN DESIGN,


pound 




SAIN: NULL; 


< 




[NG PROTEIN 


IAIN: A, B; 


ffl 






ON: NULL; 




IKE PROTEIN 
IN: A; 




IN; CHAIN: A; 


Com 




PROFIUN; Q 


1 PROFIUN; a 




E? r-f 
PDB annotation


RECEPTOR, IMMUNE 




IMMUNE SYSTEM HUMAN 
TCR/PEPTIDE/MHC 
COMPLEX, HLA-A2, HTLV-1, 
TAX, TCR, T 2 CELL 
RECEPTOR. IMMUNE 


SYSTEM 


COMPLEX (COAT 
PROTEIN/IMMUNOGLOBULI 
N) POLYPROTEIN, COAT 




RECEPTOR TCR; T-CELL, 
RECEPTOR, 
TRANSMEMBRANE, 
GLYCOPROTEIN, SIGNAL 








AMINOPEPTIDASE 
AMINOPEPTIDASE, 
PROLINE IMINOPEPTIDASE, 
SERINE PROTEASE, 2 


Compound 


CHAIN: C; HUMAN T-CELL


gw 

li 

h 




RECEPTOR; CHAIN: D;
HLA-A 0201; CHAIN: E;


HUMAN RHINOVIRUS 14
COAT PROTEIN; CHAIN: 1,
2, 3, 4; FAB17-IA; CHAIN:
L, H


ALPHA, BETA T-CELL 
RECEPTOR CHAIN: A, B; 
PDB annotation


DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 

INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


1 


Compound 




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 
PDB annotation


REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)




Compound 


ASSOCIATED VIRUS P5 
INITIATOR ELEMENT 

DNA; CHAIN: A, B, 


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT 
DNA; CHAIN: A, B; 


6 g 

5 i 


INITIATOR ELEMENT 
DNA; CHAIN: A, B; 


is 


INITIATOR ELEMENT 
DNA; CHAIN: A, B; 


YY1; CHAIN: C; ADENO-


INITIATOR ELEMENT 
DNA; CHAIN: A, B; 


SEQFOLD
0.45
PDB annotation


FGFR, IMMUNOGLOBULIN-
LIKE, SIGNAL
TRANSDUCTION, 2
DIMERIZATION, GROWTH
FACTOR/GROWTH FACTOR
RECEPTOR


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF,
FGFR, IMMUNOGLOBULIN-
LIKE, SIGNAL
TRANSDUCTION, 2
DIMERIZATION, GROWTH
FACTOR/GROWTH FACTOR
RECEPTOR


CELL ADHESION NCAM; 
NCAM, IMMUNOGLOBULIN 
FOLD, GLYCOPROTEIN 


CELL ADHESION NCAM; 
NCAM, IMMUNOGLOBULIN 
FOLD, GLYCOPROTEIN 


CELL ADHESION NCAM; 
NCAM, IMMUNOGLOBULIN 
FOLD, GLYCOPROTEIN 


CELL ADHESION NCAM;
NCAM, IMMUNOGLOBULIN
FOLD, GLYCOPROTEIN




NEURAL CELL ADHESION
MOLECULE; CHAIN: A, B,
C, D;


NEURAL CELL ADHESION
MOLECULE; CHAIN: A, B,
C, D;


NEURAL CELL ADHESION
MOLECULE; CHAIN: A, B,
C, D;


UK 




Compound 


FIBROBLAST GROWTH
FACTOR RECEPTOR 1;
CHAIN: C, D;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B;
FIBROBLAST GROWTH
FACTOR RECEPTOR 1;
CHAIN: C, D;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B, C,
D; FIBROBLAST GROWTH
FACTOR RECEPTOR 2;
CHAIN: E, F, G, H;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B, C,
D; FIBROBLAST GROWTH
FACTOR RECEPTOR 2;
CHAIN: E, F, G, H;
PDB annotation


LIKE, SIGNAL
TRANSDUCTION, 2
DIMERIZATION, GROWTH
FACTOR/GROWTH FACTOR
RECEPTOR


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF,
FGFR, IMMUNOGLOBULIN-
LIKE, SIGNAL
TRANSDUCTION, 2
DIMERIZATION, GROWTH
FACTOR/GROWTH FACTOR
RECEPTOR


COMPLEX CD16; IGG1-FC
COMPLEX, FC FRAGMENT,
IGG, FC, RECEPTOR, CD16,
GAMMA


COMPLEX CD16; IGG1-FC
COMPLEX, FC FRAGMENT,
IGG, FC, RECEPTOR, CD16,
GAMMA




CELL ADHESION NCAM;
NCAM, IMMUNOGLOBULIN
FOLD, GLYCOPROTEIN














NEURAL CELL ADHESION
MOLECULE; CHAIN: A, B,
C, D;


NEURAL CELL ADHESION
MOLECULE; CHAIN: A, B,
C, D;


NEURAL CELL ADHESION
MOLECULE; CHAIN: A, B,
C, D;


NEURAL CELL ADHESION
MOLECULE; CHAIN: A, B,
C, D;


u 


Compound 


FACTOR RECEPTOR 1;
CHAIN: C, D;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B;
FIBROBLAST GROWTH
FACTOR RECEPTOR 1;
CHAIN: C, D;


LOW AFFINITY
IMMUNOGLOBULIN
GAMMA FC RECEPTOR
CHAIN: C; FC FRAGMENT
OF HUMAN IGG1; CHAIN:
A, B;


LOW AFFINITY
IMMUNOGLOBULIN
GAMMA FC RECEPTOR
CHAIN: C; FC FRAGMENT
OF HUMAN IGG1; CHAIN:
A, B;


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B, 






















h 

to © 




















a a 
score
0.12
0.09


0.34
0.49
PDB annotation




CELL ADHESION NCAM;
NCAM, IMMUNOGLOBULIN
FOLD, GLYCOPROTEIN


CELL ADHESION NCAM; 
NCAM, IMMUNOGLOBULIN 
FOLD, GLYCOPROTEIN 


CELL ADHESION NCAM; 
NCAM, IMMUNOGLOBULIN 
FOLD, GLYCOPROTEIN 


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF2;
FGFR2; IMMUNOGLOBULIN
(IG)LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD









NEURAL CELL ADHESION
MOLECULE; CHAIN: A, B,
C, D;


NEURAL CELL ADHESION
MOLECULE; CHAIN: A, B,
C, D;




OK 


O'K 


a g 




Compound 


IMMUNOGLOBULIN
GAMMA FC RECEPTOR
CHAIN: C; FC FRAGMENT


NEURAL CELL ADHESION
MOLECULE; CHAIN: A, B,
C, D;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B, C,
D; FIBROBLAST GROWTH
FACTOR RECEPTOR 2;
CHAIN: E, F, G, H;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B, C,
D; FIBROBLAST GROWTH
FACTOR RECEPTOR 2;
CHAIN: E, F, G, H;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B, C,
D; FIBROBLAST GROWTH
FACTOR RECEPTOR 2;
CHAIN: E, F, G, H;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B, C,
D; FIBROBLAST GROWTH
FACTOR RECEPTOR 2;


9 
PDB annotation


PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)




COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX


Compound 


PROTEIN; CHAIN: C, F, G;




DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 




DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 


PROTEIN; CHAIN: C, F, G; 
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PDB annotation 


TROPOMYOSIN COILED- 
COIL ALPHA-HELICAL, 
CONTRACTILE PROTEIN 


CONTRACTILE PROTEIN 
TROPOMYOSIN COILED- 


COIL ALPHA-HELICAL, 
CONTRACTILE PROTEIN 




CONTRACTILE PROTEIN
TROPOMYOSIN COILED- 
COIL ALPHA-HELICAL, 
CONTRACTILE PROTEIN 




COMPLEX (NUCLEOCAPSID 
PROTEIN/RNA) 
NUCLEOCAPSID PROTEIN, 
COMPLEX (NUCLEOCAPSID 
PROTEIN/RNA), 2 STEM- 
LOOP RNA


TRANSFERASE MRNA


TRANSFERASE,
TRANSCRIPTION, RNA-
BINDING, 2
PHOSPHORYLATION,
NUCLEAR PROTEIN,
ALTERNATIVE SPLICING 3
HELICAL TURN MOTIF,
NUCLEOTIDYL
TRANSFERASE CATALYTIC
DOMAIN


TRANSFERASE MRNA
PROCESSING,
TRANSFERASE,


Compound 




TROPOMYOSIN; CHAIN: A, 
B,C,D 






TROPOMYOSIN; CHAIN: A, 
B,C,D 




NUCLEOCAPSID PROTEIN; 




NUCLEOCAPSID PROTEIN 
HIV-1 NUCLEOCAPSID
PROTEIN (MN ISOLATE) 
(NMR, 20 STRUCTURES) 
1AAF3 






POLY(A) POLYMERASE; 
CHAIN: A; 


SEQFOLD 
score 
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PDB annotation 



TRANSCRIPTION 
INITIATION, ZINC FINGER 
PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER 
PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) YING- 
YANG 1; TRANSCRIPTION 
INITIATION, INITIATOR 
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA- 
PROTEIN RECOGNITION, 3 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) YING- 
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR


Compound 




TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;

YY1; CHAIN: C; ADENO-


SEQFOLD 
score 
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0.53 
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PDB annotation 


ALKYLATION % 3 
PHOSPHORYLATION, 
CONTRACTILE PROTEIN


PROTEIN, MYOSIN 
SUBFRAGMENT-1, MYOSIN 
HEAD, 2 MOTOR PROTEIN 


MUSCLE PROTEIN MUSCLE
PROTEIN, MYOSIN
SUBFRAGMENT-1, MYOSIN 
HEAD, 2 MOTOR PROTEIN 




KINASE KINASE, SIGNAL

TRANSDUCTION,

CALCIUM/CALMODULIN


KINASE KINASE, SIGNAL 

TRANSDUCTION, 

CALCIUM/CALMODULIN 


TRANSFERASE 
TRANSFERASE, 
SERINE/THREONINE- 
PROTEIN KINASE, CASEIN 
KINASE, 2 SER/THR KINASE 


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! 
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< 

j 




MYOSIN; CHAIN: A, B,C; 
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if 

DEPENDENT PROTEIN
KINASE; CHAIN: NULL;


CHAIN: NULL; 


TRANSFERASE(PHOSPHO
TRANSFERASE) $C-/AMP$-
DEPENDENT PROTEIN
KINASE (E.C.2.7.1.37)
($C/APK$) 1APM 3
(CATALYTIC SUBUNIT)
ALPHA ISOENZYME
MUTANT WITH SER 139
1APM 4 REPLACED BY
ALA (/S139A$) COMPLEX
WITH THE PEPTIDE 1APM
5 INHIBITOR PKI(5-24)
AND THE DETERGENT
MEGA-8 1APM 6


TRANSFERASE(PHOSPHO 
TRANSFERASE) $C-/AMP$- 
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0.39 




0.54 




I 


Blast 




o 


o 




1.7e-89


S> 

3 


CO 


o 


o 






i— » 


ro ' 




00 

1-H 

co 


s 

cn 


i— • 


fH 
CO 


vq 

CO 


START 
AA 






NO 




o 


f-H 






CN 


s 


a 




< 


< 












W 






! 


! 




1 


1a06


o 


1apm


i 


SEQID 
NO: 




ON 

o 


ON 
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»— I 




o 


o 


o 


o 


o 
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PDB annotation 




PROTEIN KINASE CDK2; 


CYCLE, 

PHOSPHORYLATION, 
STAUROSPORINE, 2 CELL 
DIVISION, MITOSIS, 
INHIBITION 


PROTEIN KINASE CDK2; 
PROTEIN KINASE, CELL 


CYCLE, 

PHOSPHORYLATION, 
STAUROSPORINE, 2 CELL
DIVISION, MITOSIS,
INHIBITION




Compound 


DEPENDENT PROTEIN
KINASE (E.C.2.7.1.37)
($C/APK$) 1APM 3
(CATALYTIC SUBUNIT)
ALPHA ISOENZYME
MUTANT WITH SER 139


1APM 4 REPLACED BY
ALA (/S139A$) COMPLEX
WITH THE PEPTIDE 1APM
5 INHIBITOR PKI(5-24)
AND THE DETERGENT
MEGA-8 1APM 6


CYCLIN-DEPENDENT 
PROTEIN KINASE 2; 
CHAIN: NULL; 


CYCLIN-DEPENDENT 
PROTEIN KINASE 2; 


CHAIN: NULL;


PHOSPHOTRANSFERASE
CAMP-DEPENDENT
PROTEIN KINASE
CATALYTIC SUBUNIT
1CMK 3 (E.C.2.7.1.37)
1CMK 4


PHOSPHOTRANSFERASE
CAMP-DEPENDENT
PROTEIN KINASE
CATALYTIC SUBUNIT
1CMK 3 (E.C.2.7.1.37)
1CMK 4

CASEIN KINASE-1; 1CSN 4


SEQFOLD
score




112.67 






VO 
On 

NO 
t-H 


PMF
score






1.00 


1.00 


1.00


Verify
score






0.34 


0.64 


0.49 


Psi 
Blast 




00 

i 


00 

I 


o 










t— « 


T— • 

cn 


OS ^ 


START 
AA 




t-H 




vo 


00 


CHAIN
ID
















1aq1


1aq1


1cmk


1cmk
1csn


SEQID 
NO: 




on 


o 

«— c 

«-H 


o 
»— < 

^H 


o o 

^H FH 
t— « <— t 
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1 



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J3 S 
> « 



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PDB annotation 


CELL DIVISION, MITOSIS, 
PHOSPHORYLATION


TRANSFERASE JNK3; 
TRANSFERASE, JNK3 MAP 
KINASE, 

SERINE/THREONINE 
PROTEIN 2 KINASE 


KINASE KINASE, TWITCHIN,
INTRASTERIC REGULATION


KINASE KINASE, TWITCHIN, 
INTRASTERIC REGULATION 


KINASE KINASE, TWITCHIN, 
INTRASTERIC REGULATION 


KINASE KINASE, TWITCHIN, 
INTRASTERIC REGULATION 


TRANSFERASE MITOGEN 
ACTIVATED PROTEIN 
KINASE; TRANSFERASE, 
MAP KINASE, 


SERINE/THREONINE- 
PROTEIN KINASE, 2 P38 


KINASE RABBIT MUSCLE 
PHOSPHORYLASE KINASE; 
GLYCOGEN METABOLISM,
TRANSFERASE,
SERINE/THREONINE-
PROTEIN, 2 KINASE, ATP-
BINDING, CALMODULIN-
BINDING


KINASE RABBIT MUSCLE
PHOSPHORYLASE KINASE;
GLYCOGEN METABOLISM,
TRANSFERASE,


SERINE KINASE SERINE
KINASE, TITIN, MUSCLE,


Compound 

i 




C-JUN N-TERMINAL
KINASE; CHAIN: NULL; 




TWITCHIN; CHAIN: A, B; 


PQ 


TWITCHIN; CHAIN: A, B; 


MAP KINASE P38; CHAIN: 
NULL; 


PHOSPHORYLASE 
KINASE; CHAIN: NULL; 


PHOSPHORYLASE 
KINASE; CHAIN: NULL; 


< 


SEQFOLD 
score 




127.21 

1 




CO 

<0 
os 

CO 






119.80 




170.32 


124.66 


PMF 
score 






1.00 




1.00


1.00 




1.00 






Verify 
score 






0.48 




0.39 


0.36 




0.77 
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1 
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CO 
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& 
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o 

<N 


CO 
CN 
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CO 
vo 

h 
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00 
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oo 

1 


1 
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58 
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CO 




vo 

CO 
CO 


CO 
CO 


CN 

R 


CN 


i-H 

CO 
CO 


START 
AA 






vo 


CO 




r- 


«r> 


00 
i-H 


Os 


VO 


CHAIN 
ID 








< 


< 


< 








< 


B» 




1jnk


1koa


1kob


1kob


1kob


00 

Qu 


1phk


1phk


1tki


SEQID 
NO: 
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o 
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o 
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no 


o 
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o|gg 




WO 02/081731 



PCT/US02/01222 



PDB annotation 


COMPLEX




STRUCTURAL PROTEIN 
RETINAL S-ANTIGEN, 48 KD 
PROTEIN; VISUAL 
ARRESTIN, 


DESENSITISATION OF THE
VISUAL TRANSDUCTION 2
CASCADE, BINDING TO
ACTIVATED AND
PHOSPHORYLATED
RHODOPSIN


STRUCTURAL PROTEIN
RETINAL S-ANTIGEN, 48 KD
PROTEIN; VISUAL
ARRESTIN,


DESENSITISATION OF THE
VISUAL TRANSDUCTION 2
CASCADE, BINDING TO
ACTIVATED AND
PHOSPHORYLATED
RHODOPSIN


STRUCTURAL PROTEIN
RETINAL S-ANTIGEN, 48 KD
PROTEIN; VISUAL


DESENSITISATION OF THE
VISUAL TRANSDUCTION 2
CASCADE, BINDING TO
ACTIVATED AND
PHOSPHORYLATED
RHODOPSIN




PROTEASE PROSOME,
MULTICATALYTIC
PROTEASE, MCP,
MACROPAIN; PROTEASE,
PROTEASOME, HYDROLASE


MULTICATALYTIC


i 
t 

o 
U 






ARRESTIN; CHAIN: A, B, C,




ARRESTIN; CHAIN: A, B, C, 
D; 


ARRESTIN; CHAIN: A, B, C,


20S PROTEASOME;


SEQFOLD 
score 






73.18 




71.95 

• 

• 




71.75 


55.61 


PMF
score 








0.00 










Verify 
score 








-0.35 










Psi 
Blast 






1 

CS 


i— i 


i 




? 

cs 


m 

■>* 








oo 

CO 










1 soz 


START 
AA 






»— i 


58. 






CS 


1—1 


CHAIN 
ID 






< 


< 






« 


w 


la 






o 


i— 1 


1cf1




1pma


1 

i— * 


SEQID 
NO: 






f— 1 


1 

CS 
— 1 


3 






CS 
CS 
— ^ 
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PDB annotation 


PROTEINASE 
MULTICATALYTIC 
PROTEINASE, 20S 
PROTEASOME, PROTEIN 2 
DEGRADATION, ANTIGEN 
PROCESSING, HYDROLASE, 
PROTEASE 


MULTICATALYTIC
PROTEINASE 
MULTICATALYTIC 
PROTEINASE, 20S 
PROTEASOME, PROTEIN 2 
DEGRADATION, ANTIGEN 
PROCESSING, HYDROLASE, 
PROTEASE 


MULTICATALYTIC 
PROTEINASE 
MULTICATALYTIC 
PROTEINASE, 20S 
PROTEASOME, PROTEIN 2 
DEGRADATION, ANTIGEN 
PROCESSING, HYDROLASE, 


PROTEASE


MULTICATALYTIC
PROTEINASE
MULTICATALYTIC
PROTEINASE, 20S
PROTEASOME, PROTEIN 2
DEGRADATION, ANTIGEN
PROCESSING, HYDROLASE,
PROTEASE




RIBOSOMAL PROTEIN
RIBOSOMAL PROTEIN,
RRNA-BINDING




Compound 


CHAIN: A, B, C, D, E, F, G,
H, I, J, K, L, M, N, O, P, Q,


20S PROTEASOME;
CHAIN: A, B, C, D, E, F, G,
H, I, J, K, L, M, N, O, P, Q,




RIBOSOMAL PROTEIN L9; 
CHAIN: NULL; 






SEQFOLD 
score 




84.34 


58.38 


52.75 






54.38 




PMF 
score 












0.82 






Verify 
score 












0.38






Psi 
Blast 




1.5e-36 


5 

i-H 


oo 

1 

CO 




00 
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! 

OO 

vd 


















en 




START
AA




r-i 








s 


35 




CHAIN 
ID


















g9 




1 


1ryp


i-H 




1div


1div




SEQID 
NO: 








r- 1 
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PDB annotation 


HALOPEROXIDASE 
BROMOPEROXIDASE L, 
HALOPEROXIDASE L; 
HALOPEROXIDASE, 
OXIDOREDUCTASE 


HALOPEROXIDASE
CHLOROPEROXIDASE A1,
HALOPEROXIDASE A1;
HALOPEROXIDASE,
OXIDOREDUCTASE


HALOPEROXIDASE 
HALOPEROXIDASE F; 
HALOPEROXIDASE, 
OXIDOREDUCTASE, 
PROPIONATE COMPLEX 


i 


AMINOPEPTIDASE, 
PROLINE IMINOPEPTIDASE, 
SERINE PROTEASE, 2
XANTHOMONAS 
CAMPESTRIS 




HALOPEROXIDASE
HALOPEROXIDASE A2,


CHLOROPEROXIDASE A2;
HALOPEROXIDASE,
OXIDOREDUCTASE,
PEROXIDASE, ALPHA/BETA
2 HYDROLASE FOLD,
MUTANT M99T


HYDROLASE BPHD;
HYDROLASE, PCB
DEGRADATION


HYDROLASE A/B
HYDROLASE FOLD,
DEHALOGENASE I-S BOND


Compound 


CHLOROPEROXIDASE L; 
CHAIN:A,B,C; 






CHLOROPEROXIDASE F;






PROLINE 


IMINOPEPTIDASE; CHAIN: 
A,B; 




DEHALOGENASE; CHAIN:




BROMOPEROXIDASE A2; 
CHAIN: NULL; 






HALOALKANE 
DEHALOGENASE; 1- 
CHLOROHEXANE CHAIN: 


SEQFOLD 
score 


68.67 


54.47 


50.66 


56.93 




71.01 


68.50 


63.30 


73.30 


PMF 
score 


















Verify 
score 
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PDB annotation 




PROTEIN KINASE CDK2; 

TRANSFERASE, 

SERINE/THREONINE 


PROTEIN KINASE, ATP- 
BINDING, 2 CELL CYCLE, 
CELL DIVISION, MITOSIS, 
PHOSPHORYLATION 


KINASE KINASE, TWITCHIN,


INTRASTERIC REGULATION 


KINASE RABBIT MUSCLE 
PHOSPHORYLASE KINASE;
GLYCOGEN METABOLISM,
TRANSFERASE,
SERINE/THREONINE-


PROTEIN, 2 KINASE, ATP-
BINDING, CALMODULIN-
BINDING


TRANSFERASE MAP


KINASE, 


SERINE KINASE SERINE
KINASE, TITIN, MUSCLE,
AUTOINHIBITION




TRANSMEMBRANE


PROTEIN COLICIN,
BACTERIOCIN, ION
CHANNEL FORMATION,


Compound 


TRANSFERASE(PHOSPHO 
TRANSFERASE) CAMP- 


DEPENDENT PROTEIN 
KINASE (E.C.2.7.1.37)
(CAPK) 1CTP 3
(CATALYTIC SUBUNIT)
1CTP 4


HUMAN CYCLIN- 


DEPENDENT KINASE 2; 
CHAIN: NULL; 








PHOSPHORYLASE 
KINASE; CHAIN: NULL; 


^| 




TITIN; CHAIN: A, B; 




COLICIN IA; CHAIN:


NULL;


SEQFOLD
score


100.24 


78.81 


104.78 


110.63 


73.28 


75.92 

i 
i 
i 




cs 
«0 

r-t 


PMF
score
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i 


score 
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PDB annotation 


PHOSPHORYLATION


ACTIN-BINDING PROTEIN 
ACTIN-BINDING PROTEIN, 
CALCIUM-BINDING, 
PHOSPHORYLATION 


ACTIN-BINDING CALPONIN
HOMOLOGY (CH) DOMAIN;
FILAMENTOUS ACTIN-
BINDING DOMAIN,
CYTOSKELETON


DYSTROPHIN, MUSCULAR 
DYSTROPHY, CALPONIN 
HOMOLOGY DOMAIN, 2 
ACTIN-BINDING, UTROPHIN


STRUCTURAL PROTEIN 
CALPONIN HOMOLOGY 
DOMAIN, DOMAIN 
SWAPPING, ACTIN 
BINDING, 2 UTROPHIN,
DYSTROPHIN,




EXTRACELLULAR MODULE
OSTEONECTIN, SPARC
SECRETED PROTEIN ACIDIC
AND EXTRACELLULAR
MODULE, GLYCOPROTEIN,
ANTI-ADHESIVE PROTEIN, 2
COLLAGEN BINDING, SITE-
DIRECTED MUTAGENESIS,
GLYCOSYLATED 3 PROTEIN
MODRES


CHAPERONE HSP40;
CHAPERONE, HEAT SHOCK,
PROTEIN FOLDING, DNAK


MOLECULAR CHAPERONE
HDJ-1; MOLECULAR


Compound 






i 


SPECTRIN BETA CHAIN; 
CHAIN: A; 


DYSTROPHIN; CHAIN: A,
B, C, D;


UTROPHIN ACTIN
BINDING REGION; CHAIN: 
A,B; 




« 


PROTEIN BM-40; CHAIN: 
A,B; 




i 




HUMAN HSP40; CHAIN:


NULL; 


SEQFOLD 
score 














60.57 








PMF 
score 




0.04 


0.07 


-0.07 


-0.05 








0.96 


1.00 


Verify 
score 




-0.09 


0.03 


0.21 


0.03 
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PDB annotation 


SUBUNIT; GAMMA1,
TRANSDUCIN GAMMA


SUBUNIT; COMPLEX (GTP-
BINDING/TRANSDUCER), G
PROTEIN, HETEROTRIMER 2
SIGNAL TRANSDUCTION


COMPLEX (GTP-
BINDING/TRANSDUCER)
BETA1, TRANSDUCIN BETA
SUBUNIT; GAMMA1,
TRANSDUCIN GAMMA
SUBUNIT; COMPLEX (GTP-
BINDING/TRANSDUCER), G
PROTEIN, HETEROTRIMER 2
SIGNAL TRANSDUCTION


COMPLEX (GTP-
BINDING/TRANSDUCER)
BETA1, TRANSDUCIN BETA
SUBUNIT; GAMMA1,
TRANSDUCIN GAMMA


SUBUNIT; COMPLEX (GTP-
BINDING/TRANSDUCER), G
PROTEIN, HETEROTRIMER 2
SIGNAL TRANSDUCTION


OXIDOREDUCTASE
ENZYME, NITRITE
REDUCTASE,
OXIDOREDUCTASE,
DENITRIFICATION, 2
ELECTRON TRANSPORT,
PERIPLASMIC


Compound 


GAMMA; CHAIN: G; 


GT-ALPHA/GI-ALPHA 
CHIMERA; CHAIN: A; GT- 
BETA; CHAIN: B; GT-
GAMMA; CHAIN: G; 


GT-ALPHA/GI-ALPHA 
CHIMERA; CHAIN: A; GT- 
BETA; CHAIN: B; GT-
GAMMA; CHAIN: G; 


CYTOCHROME CD1
NITRITE REDUCTASE; 
CHAIN: A, B; 




23S RRNA; CHAIN: 0; 5S 
RRNA; CHAIN: 9; 
RIBOSOMAL PROTEIN L2; 
CHAIN: A; RIBOSOMAL 
PROTEIN L3; CHAIN: B; 
RIBOSOMAL PROTEIN L4; 
CHAIN: C; RIBOSOMAL; 


SEQFOLD 
score 














PMF 
score 




0.19 


-0.05 


-0.17 




0.94 
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Verify 
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PDB annotation 


HL33; 50S RIBOSOMAL


PROTEIN L44E, LA, HLA; 50S 
RIBOSOMAL PROTEIN L6P, 
HMAL6, HL10 RIBOSOME 


ASSEMBLY, RNA-RNA, 
PROTEIN-RNA, PROTEIN- 




w 


SERINE PROTEASE PCPA2;
SERINE PROTEASE,


ZYMOGEN, HYDROLASE


SERINE PROTEASE
PORCINE
PROCARBOXYPEPTIDASE,
SERINE PROTEASE


Compound 


s 


RIBOSOMAL PROTEIN 
L24; CHAIN: Q; 
RIBOSOMAL PROTEIN 
L24E; CHAIN: R; 
RIBOSOMAL PROTEIN 
L29; CHAIN: S; 


L30; CHAIN: T; 


L31E; CHAIN: U;
RIBOSOMAL PROTEIN 
L32E; CHAIN: V; 
RIBOSOMAL PROTEIN 
L37AE; CHAIN: W; 
RIBOSOMAL PROTEIN 
L37E; CHAIN: X; 
RIBOSOMAL PROTEIN 
L39E; CHAIN: Y; 
RIBOSOMAL PROTEIN 
L44E; CHAIN: Z; 

CHAIN: 1; 




°i 

Is 




PROCARBOXYPEPTIDASE 

Tk. /-«nr A TXT. XTTTT T 


Q 


HYDROLASE(C- 
TERMINAL PEPTIDASE) 
PROCARBOXYPEPTIDASE 
A (E.C.3.4.12.2) 1PCA 3




CENTROMERE PROTEIN 

r». ntT A TXT. A . 


f 


SEQFOLD
score






107.01 


NO 

© 


114.78 






PMF 


score 














0.83 
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PDB annotation 


STRUCTURE




LIPID BINDING PROTEIN
APO-E3; LIPID TRANSPORT,
LIPID TRANSPORT,
HEPARIN-BINDING,
PLASMA 2 PROTEIN, HDL,
VLDL REMARK l 


COMPLEX (HSP24/HSP70)
HSP70, GRPE, MOLECULAR
CHAPERONE, NUCLEOTIDE
EXCHANGE 2 FACTOR,
COILED-COIL, COMPLEX
(HSP24/HSP70) 


ENDOCYTOSIS/EXOCYTOSIS
NSEC1; PROTEIN-PROTEIN
COMPLEX, MULTI-SUBUNIT


ENDOCYTOSIS/EXOCYTOSIS
SYNAPTOTAGMIN
ASSOCIATED 35 KDA
PROTEIN, P35A, THREE
HELIX BUNDLE


CONTRACTILE PROTEIN
TRIPLE-HELIX COILED
COIL, CONTRACTILE
PROTEIN


CONTRACTILE PROTEIN
TRIPLE-HELIX COILED
COIL, CONTRACTILE
PROTEIN


LIPID TRANSPORT APO A-I;
LIPOPROTEIN, LIPID
TRANSPORT,


i 
! 






APOLIPOPROTEIN E;
CHAIN: A;


NUCLEOTIDE EXCHANGE 
FACTOR GRPE; CHAIN: A, 
B; MOLECULAR 
CHAPERONE DNAK;
CHAIN: D; 


SYNTAXIN BINDING 
PROTEIN 1; CHAIN: A; 
SYNTAXIN 1A; CHAIN: B; 


SYNTAXIN-1A; CHAIN: A,
B, C;


NUCLEAR FACTOR XNF7; 
CHAIN: NULL; 


HUMAN SKELETAL 
MUSCLE ALPHA-ACTININ 
2; CHAIN: A; 


HUMAN SKELETAL 
MUSCLE ALPHA-ACTININ 
2; CHAIN: A; 




APOLIPOPROTEIN A-I;
CHAIN: A, B, C, D;


SEQFOLD 
score 


















59.40 




64.70 


PMF 
score 






0.27 


00 " 
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0.00 


0.11 


© 


0.51 








Verify 
score 






-0.14 




-0.00 


0.21 


-0.40 


0.16 
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PDB annotation 


PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 




COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX


Compound 


PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G; 


SEQFOLD 
score 














PMF 
score 




0.89
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1.00 


1.00


8 


Verify 
score 
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PDB annotation 


INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA

INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA) 


Compound 




DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 


PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


SEQFOLD 
score 




100.48
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1.00 


1.00 
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PDB annotation 




COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA

INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL

STRUCTURE, COMPLEX

(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA

INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


Compound 


MUTANT WITH CYS 11 
1BBO 3 REPLACED BY
ABU (C11ABU) (NMR, 60


<r 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


SEQFOLD
score 














PMF 
score 




r. 


ON 

o 


1.00 


1.00 


1.00 


Verify 
score 




0.01 


0.14 


0.48 




0.79 
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PDB annotation 


(TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION
REGULATION/DNA) TFIIIA;
5S GENE; NMR, TFIIIA,
PROTEIN, DNA,
TRANSCRIPTION FACTOR,
5S RNA 2 GENE, DNA
BINDING PROTEIN, ZINC
FINGER, COMPLEX 3
(TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER


PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER


PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING- 
YANG 1; TRANSCRIPTION 


Compound 




TRANSCRIPTION FACTOR 
IIIA; CHAIN: A; 5S RNA
GENE; CHAIN: E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT


























115.35 










PMF 
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1.00 
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1.00 


Verify 
score 
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PDB annotation 


PROTEIN/DNA) FIVE- 
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA- 
BINDING PROTEIN/DNA) 


COMPLEX (DNA-BINDING 
PROTEIN/DNA) FIVE- 
FINGER GLI; GLI, ZINC 
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)




TRANSFERASE MRNA 
PROCESSING, 


TRANSCRIPTION, RNA- 
BINDING, 2 
PHOSPHORYLATION, 
NUCLEAR PROTEIN, 
ALTERNATIVE SPLICING 3
HELICAL TURN MOTIF, 
NUCLEOTIDYL 
TRANSFERASE CATALYTIC 
DOMAIN 




COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION


Compound 


GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;




POLY(A) POLYMERASE; 
CHAIN: A; 




DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


YY1; CHAIN: C; ADENO-


SEQFOLD 
score 
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0.92 




1.00










Verify 
score 
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PDB annotation 


REGULATION/DNA) YING- 
YANG 1; TRANSCRIPTION 

INITIATION, INITIATOR

ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (DNA-BINDING 
PROTEIN/DNA) FIVE- 
FINGER GLI; GLI, ZINC 
FINGER, COMPLEX (DNA- 
BINDING PROTEIN/DNA) 




COMPLEX (ZINC 
FINGER/DNA) COMPLEX 
(ZINC FINGER/DNA), ZINC 
FINGER, DNA-BINDING 
PROTEIN 


COMPLEX (ZINC 
FINGER/DNA) COMPLEX 
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


Compound 


ASSOCIATED VIRUS P5 
INITIATOR ELEMENT 
DNA; CHAIN: A, B; 


ZINC FINGER PROTEIN 
GLI1; CHAIN: A; DNA;
CHAIN: C, D;




QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE
BINDING SITE; CHAIN: B, 
C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE
BINDING SITE; CHAIN: B, 
C; 


QGSR ZINC FINGER
PEPTIDE; CHAIN: A;
DUPLEX

OLIGONUCLEOTIDE
BINDING SITE; CHAIN: B, 
C; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


SEQFOLD 
score 




96.09 












PMF 
score 
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0.43 


0.12 
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Verify 
score 
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PDB annotation 


DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


Compound 












DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G;




DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 




DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 




SEQFOLD 
score 
















PMF 
score 




1.00 


8 
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1.00 
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1.00 | 


Verify 
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PDB annotation 


INITIATION, ZINC FINGER 
PROTEIN 


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2






Compound


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


Q 
















SEQFOLD
score 






























PMF 
score 
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0.43 


Verify 
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t 

< 



COMPLEX (DNA-BINDING 
PROTEIN/DNA) FIVE- 
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA- 
BINDING PROTEIN/DNA) 




IMMUNOGLOBULIN A 
IMMUNOGLOBULIN, KAPPA 
LIGHT-CHAIN DIMER 
HEADER 


COMPLEX 

(ANTIBODY/ANTIGEN) FAB- 
12; VEGF; COMPLEX 
(ANTIBODY/ANTIGEN), 
ANGIOGENIC FACTOR 


COMPLEX (HUMANIZED 
ANTIBODY/HYDROLASE) 


MURAMIDASE; 
HUMANIZED ANTIBODY, 
ANTIBODY COMPLEX, FV, 
ANTI-LYSOZYME, 2 
COMPLEX (HUMANIZED 
ANTIBODY/HYDROLASE) 


IMMUNE SYSTEM REIV,
STABILIZED
IMMUNOGLOBULIN
FRAGMENT, BENCE-JONES
2 PROTEIN, IMMUNE
SYSTEM


IMMUNE SYSTEM FAB-IBP
COMPLEX CRYSTAL
STRUCTURE 2.7A
RESOLUTION BINDING 2


Compound 


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA; 
CHAIN: C,D; 




CHAIN: A, B;




FAB FRAGMENT; CHAIN: 
L, H, J, K; VASCULAR
ENDOTHELIAL GROWTH 
FACTOR; CHAIN: V,W; 


HULYS11; CHAIN: A, B, D,
E; LYSOZYME; CHAIN: C,
F; 


IG KAPPA CHAIN V-I 
REGION REI; CHAIN: A, B; 


CAMPATH-1H: LIGHT
CHAIN; CHAIN: L;


CAMPATH-1H: HEAVY
CHAIN; CHAIN: H; 


PEPTIDE ANTIGEN; 
CHAIN: P; 


IGM RF 2A2; CHAIN: A, C, 
E; IGM RF 2A2; CHAIN: B,
D, F; IMMUNOGLOBULIN 
G BINDING PROTEIN A; 


SEQFOLD 
score 


















a 
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0.43 


0.17 
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score 
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PDB annotetioii 


OUTSIDE THE ANTIGEN 
COMBINING SITE 


IMMUNE SYSTEM
IMMUNOGLOBULIN FOLD,
ANTIBODY, IGM, FV
















CHAIN: G,H; 


IGM MEZ 

IMMUNOGLOBULIN; 
CHAIN: L; IGM MEZ 
IMMUNOGLOBULIN; 
CHAIN: H; 


IMMUNOGLOBULIN FV
FRAGMENT OF A
HUMANIZED VERSION OF
THE ANTI-CD18 1FGV 3
ANTIBODY H52' (HUH52-
AA FV) 1FGV 4


IMMUNOGLOBULIN
IMMUNOGLOBULIN M
(IG-M) FV FRAGMENT
1IGM 3


IMMUNOGLOBULIN FAB
FRAGMENT OF A
HUMANIZED VERSION OF
THE ANTI-CD18 2FGW 3
ANTIBODY H52' (HUH52-
OZ FAB) 2FGW 4




REDUCTASE (E.C.1.5.1.3)
COMPLEX WITH FOLATE
1DRF 3


REDUCTASE (E.C.1.5.1.3)
COMPLEX WITH FOLATE
1DRF 3




SEQFOLD 
score 
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PDB annotation 






COMPLEX (DNA-BINDING 
PROTEIN/DNA) 






Compound 


DNA-BINDING HIGH
MOBILITY GROUP
PROTEIN FRAGMENT-B
(HMGB) (DNA-BINDING
1HME 3 HMG-BOX
DOMAIN B OF RAT HMG1)
(NMR, 1 STRUCTURE)
1HME 4


DNA-BINDING HIGH
MOBILITY GROUP
PROTEIN FRAGMENT-B
(HMGB) (DNA-BINDING
1HME 3 HMG-BOX
DOMAIN B OF RAT HMG1)
(NMR, 1 STRUCTURE)


1HME 4


HUMAN SRY; 1HRY 6
CHAIN: A; 1HRY 7 DNA;
1HRY 9 CHAIN: B; 1HRY 10


DNA-BINDING HIGH
MOBILITY GROUP
PROTEIN 1 (HMG1) BOX 2,
COMPLEXED WITH 1HSM
3 MERCAPTOETHANOL


AVERAGE STRUCTURE)
1HSM 4




FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B, C, 
D; FIBROBLAST GROWTH 
FACTOR RECEPTOR 2; 
CHAIN: E, F, G, H;


SEQFOLD 
score 
















PMF 
score 


0.88 


0.90 


0.16 
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o 


0.87 




0.11 


Verify 
score 


0.34 
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0.10 
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PDB annotation 


SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD






LIGASE CBL, UBCH7, ZAP-
70, E2, UBIQUITIN, E3,
PHOSPHORYLATION, 2
TYROSINE KINASE,
UBIQUITINATION, PROTEIN
DEGRADATION,


METAL BINDING PROTEIN 
RING FINGER PROTEIN 
MAT1; RING FINGER
(C3HC4) 


DNA-BINDING PROTEIN 
V(D)J RECOMBINATION
ACTIVATING PROTEIN 1;
RAG1, V(D)J
RECOMBINATION,
ANTIBODY, MAD, RING
FINGER, 2 ZINC BINUCLEAR
CLUSTER, ZINC FINGER,
DNA-BINDING PROTEIN


9 


g 

Si Ha 

|||p||i 


PI 
III 


Compound 






VIRUS EQUINE HERPES 
VIRUS-1 (C3HC4, OR RING
DOMAIN) 1CHC 3 (NMR, 1
STRUCTURE) 1CHC 4


SIGNAL TRANSDUCTION 
PROTEIN CBL; CHAIN: A; 
ZAP-70 PEPTIDE; CHAIN: 
B; UBIQUITIN- 
CONJUGATING ENZYME 
E2-18 KDA UBCH7;
CHAIN: C; 


CDK-ACTIVATING 
KINASE ASSEMBLY 
FACTOR MAT1; CHAIN: A;


RAG1; CHAIN: NULL;








INTEGRASE; CHAIN: A, B, 




SEQFOLD 
score 
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PDB annotation 


PROTEIN, DNA, 
TRANSCRIPTION FACTOR, 
5S RNA 2 GENE, DNA 
BINDING PROTEIN, ZINC 
FINGER, COMPLEX 3 
(TRANSCRIPTION 
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) TFIIIA;
5S GENE; NMR, TFIIIA,
PROTEIN, DNA,
TRANSCRIPTION FACTOR,
5S RNA 2 GENE, DNA
BINDING PROTEIN, ZINC
FINGER, COMPLEX 3
(TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION 
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION


Compound 




g 
p 


GENE; CHAIN: E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE; 


p$ 
u 
« 


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


j 




g 




60.98 








i 




score | 






i 


0.75 


CM 

d 


Verify 


score | 






© 


0.09 


0.37 


Psi 
Blast 




m 




1.4e-30 


i-t 

00 
VO 








! 


00 
»-l 


o 

00 

cn 


START 
AA 




s 


s 


VO 
<N 


00 




s 






< 


<! 




§a 




§ 






ltf6 


SEQ ID
NO: 




1 


8 

cn 


VO 
O 
CO 


VO 

o 
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CO 
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o 
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m 
O 

o 



§5 



CO 

d 
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PDB annotation 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 


YANG 1; TRANSCRIPTION 
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 


TRANSCRIPTION 
REGULATION 
TRANSCRIPTION 
REGULATION, ADR1, ZINC
FINGER, NMR 


TRANSCRIPTION 
REGULATION 
TRANSCRIPTION 
REGULATION, ADR1, ZINC
FINGER, NMR 


TRANSCRIPTION
REGULATION
TRANSCRIPTION
REGULATION, ADR1, ZINC
FINGER, NMR


*wso 

g g g g D| 
p ^ p ^ 


Si 


ll 
r 

'PI 


i 

*9 Eg ft 

Hi 

sll 


Compound 




YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B; 




ADR1; CHAIN: NULL;


ADR1; CHAIN: NULL;


ADR1; CHAIN: NULL;




ADR1; CHAIN: NULL;


ADR1; CHAIN: NULL;


ZINC FINGER PROTEIN 
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


SEQ FOLD 
score 






51.79 














score 




1.00 




0.25 


0.09 


0.09 


-0.01 




-0.15 


1 


score 




0.22




-0.35 


-0.30 


0.41 


0.19 


-0.00 




Blast 




? 


vn 

OO 
VO 


v> 
t— 1 

oo 
VO 


oo 
o 

% 

oo 


*-« 

4 

cn 


CN 

? 


l-H 

cn 






i 


CN 


f— » 


o 

o\ 




NO 

in 
cn 


in 

$ 


START 
AA 






oo 


VO 


o\ 

1-4 


00 


in 
i—i 
cn 


ss 

i— i 




a 




U j 












< 


la 




1ubd




2adr 




2adr




2adr 


2adr 


2gli 


SEQ ID
NO: 




cn 


VO 
O 

cn 


VO 
O 

cn 


s 

cn 


1 


8. 
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PDB annotation 


P24 .,, I 


< co ^ 

|j ^ § ^ 


COMPLEX (MHC/VIRAL
PEPTIDE/RECEPTOR)


IMMUNOGLOBULIN 
HUMAN FAB, ANTI- 
TETANUS TOXOID, HIGH 
AFFINITY, CRYSTAL 2 
PACKING MOTIF, 
PROGRAMMING 
PROPENSITY TO 
CRYSTALLIZE, 3 
IMMUNOGLOBULIN 


IMMUNOGLOBULIN 
IMMUNOGLOBULIN, 
ANTIBODY, FAB, ENZYME 
INHIBITOR, PCR, 2 HOT
START 




CATALYTIC ANTIBODY
CATALYTIC ANTIBODY,
CARBOCATION, 2
CYCLIZATION CASCADE


lip 

ilii 


Compound 




HLA-A 0201; CHAIN: A;
BETA-2 MICROGLOBULIN;
CHAIN: B; TAX PEPTIDE;
CHAIN: C; T CELL
RECEPTOR ALPHA;
CHAIN: D; T CELL
RECEPTOR BETA; CHAIN:
E;


FAB B7-15 A2; CHAIN: L, H; 




S3 
1 

IX 


1 


COMPLEX 

(ANTIBODY/ANTIGEN) 


HYHEL-5 FAB 
COMPLEXED WITH 
BOBWHITE QUAIL
LYSOZYME 1BQL 3 1BQL
95 


CATALYTIC ANTIBODY 
19A4 (LIGHT CHAIN);
CHAIN: L; CATALYTIC 
ANTIBODY 19A4 (HEAVY 
CHAIN); CHAIN: H; 


FAB ANTIBODY LIGHT 


1 


CHAIN; CHAIN: H;


3 


score 1 




52.66 


51.01 


53.04 




o 




PMF j 


score 1 










0.18 




0.30 


! 


score 










0.00 




0.10 




Blast I 




0.0036 


r— 1 
00 


1.5e-15


00 
r-l 

6 


3.4e-17 


ON 
i—l 

"0 

1-4 








o 

*— i 

CN 


VO 
»— i 

CN 


§ 


00 

cn 


i— 1 

CN 


START | 


AA j 






cn 
^4 




CO 

cn 




i-H 

cn 


CHAIN | 


a 




« 




W 


w 


a 


» 


la 




1 


1aqk


1ay1


1bql






SEQ ID
NO: 




S 

CO 


o 

CO 


S 
m 


B 

CO 


s 

cn 


© 

cn 
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PDB annotation 


CELL ADHESION NEURAL 
CELL ADHESION 


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF, 
FGFR, IMMUNOGLOBULIN-
LIKE, SIGNAL
TRANSDUCTION,
DIMERIZATION, GROWTH 
FACTOR/GROWTH FACTOR 
RECEPTOR 




VIRUS/VIRAL PROTEIN,
RECEPTOR CD155, PVR,
HUMAN POLIOVIRUS,
ELECTRON MICROSCOPY, 2
POLIOVIRUS-RECEPTOR
COMPLEX, VIRUS/VIRAL
PROTEIN, RECEPTOR


his 


/Oli 

ii" 

jji 


BSE 

* i 

^ ^ ^ 


T 

5 

i 


3 

I 

; 


AXONIN-1; CHAIN: A; 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 1; 
CHAIN: C,D; 


IMMUNOGLOBULIN FAB' 
FRAGMENT OF THE DB3- 
ANTI-STEROID 
MONOCLONAL 
ANTIBODY 1DBB 3 (IGG1,
SUBGROUP 2A, KAPPA 1)
COMPLEX WITH
PROGESTERONE 1DBB 4


POLIOVIRUS RECEPTOR; 
CHAIN: R; VP1; CHAIN: 1;
VP2; CHAIN: 2; VP3;
CHAIN: 3; VP4; CHAIN: 4;


ANTI-LYSOZYME 
ANTIBODY HYHEL-63
(LIGHT CHAIN); CHAIN: A,
C; ANTI-LYSOZYME
ANTIBODY HYHEL-63
(HEAVY CHAIN); CHAIN: 
B,D; 


IGG ANTIBODY (LIGHT 
CHAIN); CHAIN: L; IGG 
ANTIBODY (HEAVY 
CHAIN); CHAIN: H; 


FAB NC10.14 - LIGHT 
CHAIN; CHAIN: L, A; FAB 
NC10.14 - HEAVY CHAIN;


SEQFOLD 
score 
















a 


score 


0.48 


0.09 


0.05 


0.31 


0.24 


CO 
00 

d 


0.69 


Verify 
score 


o 

9 


0.01 


-0.40 


9 


S 
9 


0.19 


s 

o 


Psi 
Blast 


CO 


r- 
in 


t— • 
6 

in 


1.7e-25 


0\ 

6 


CO 


vo 




*—4 






oo 






<N 
»— i 


START 
AA 


CM 


p; 










VO 
CO 






< 


o 


W 




a 






M 


NO 

3 


1cvs


1dbb


1dgi


1dqq


! 


1etz


SEQ ID
NO:
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CO 
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CO 
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o 

CO 


CO 
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lWsiB7iiB|2E[ 
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O 2! 
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CO 



si 
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CO 
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§ 3$ S 25 
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s 
o 

1 



g 




CM 

l8s"i 



*v| Q _ < 



3 S 

nip 



o o I 



CO 

i 



33 



I 



a 



eg 

S3 




3 

d 



1 




8 




4 



8 




a93 



8 o 



8 ! 



ON 
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d 



9 






S3 

© 

a 
S 



n 

s 



1 








« m ° * 
O S w o Q] 

3 go, 





is 



to 



d 



3 

o 



f I 



3 



O 



s 1 
w 



Is 



CO 



co 



CO 



2 



3 



8 



3 



5 
4 



CO 
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o 

CO 



o 
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PDB annotation 


IMMUNOGLOBULIN 
FRAGMENT, BENCE JONES
2 PROTEIN, IMMUNE 
SYSTEM 


IMMUNE SYSTEM FAB-IBP
COMPLEX CRYSTAL
STRUCTURE 2.7A
RESOLUTION BINDING 2
OUTSIDE THE ANTIGEN
COMBINING SITE
SUPERANTIGEN FAB VH3 3
SPECIFICITY




IMMUNE SYSTEM 
IMMUNOGLOBULIN FOLD, 
ANTIBODY, IGM, FV 


IPE 




.B* 




iS5i! 
1 


Compound 




IGM RF 2A2; CHAIN: A, C, 
E; IGM RF 2A2; CHAIN: B, 
D, F; IMMUNOGLOBULIN 
G BINDING PROTEIN A; 
CHAIN: G,H; 


a 

3 CO 

p 


IGM MEZ

IMMUNOGLOBULIN;
CHAIN: L; IGM MEZ
IMMUNOGLOBULIN; 
CHAIN: H; 


IMMUNOGLOBULIN FV 
FRAGMENT OF A 
HUMANIZED VERSION OF 
THE ANTI-CD18 1FGV 3
ANTIBODY H52' (HUH52-
AAFV) 1FGV 4


IMMUNOGLOBULIN FV 
FRAGMENT OF A 
HUMANIZED VERSION OF 
THE ANTI-CD18 1FGV 3
ANTIBODY H52' (HUH52-
AAFV) 1FGV 4


IMMUNOGLOBULIN 
IMMUNOGLOBULIN M 


(IG-M) FV FRAGMENT 
1IGM 3


IMMUNOGLOBULIN 
IMMUNOGLOBULIN M 
(IG-M) FV FRAGMENT 
1IGM 3


SEQFOLD 
score 










55.38 




52.10 


• 


PMF 
score 




00 

© 


© 
o 
^4 


© 




© 




© 
© 


Verify 
score 




© 


co 
© 


8 

o* 




© 




© 


Psi 
Blast 






3.4e-46 


! 




1.7e-46


I.. 

1-4 


1 

1-4 






CO 

l-H 


en 

i-H 


CO 

i-H 

1—1 




CO 

1-1 


00 

i-» 


CO 
t-4 
i-4 


START 
AA 




a 




a 






a 




CHAIN 
ID 




< 


•J 




►J 




•j 




la 




1 


r-H 


*a« 
•a 

«— » 


go 

f-l 


& 


I 


1 

1-1 


SEQ ID
NO: 




© 
co 


© 
*— • 

co 


© 

CO 


© 

1—1 

CO 


© 

CO- 


© 

r-4 
CO 


© 

I— « 
CO 
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PDB annotation 


5 


> 


TRANSFERASE 

TRANSFERASE, 

GLUTATHIONE, 

CONJUGATION, 

DETOXIFICATION, 2 

CYTOSOLIC DIMER




GENE REGULATION POZ
DOMAIN: PROTEIN-
PROTEIN INTERACTION
DOMAIN,
TRANSCRIPTIONAL 2
REPRESSOR, ZINC-FINGER
PROTEIN, X-RAY
CRYSTALLOGRAPHY, 3
PROTEIN STRUCTURE,
PROMYELOCYTIC
LEUKEMIA, GENE
REGULATION


,/OI 


B 


COMPLEX (ZINC
FINGER/DNA) COMPLEX 


Compound 


A 1GNE 3 CONSERVED
NEUTRALIZING EPITOPE
ON GP41 OF HUMAN 1GNE
4 IMMUNODEFICIENCY
VIRUS TYPE 1,
COMPLEXED WITH
GLUTATHIONE 1GNE 5


GLUTATHIONE 
TRANSFERASE 
GLUTATHIONE S-
TRANSFERASE
(E.C.2.5.1.18) (26 KDA)
1GTA 3


GLUTATHIONE S-
TRANSFERASE; CHAIN: A,
B, C, D;




PROMYELOCYTIC 
LEUKEMIA ZINC FINGER 
PROTEIN PLZF; CHAIN: A;


OXIDOREDUCTASE (OXYGEN(A))
GALACTOSE
OXIDASE (E.C.1.1.3.9) (PH
4.5) 1GOF 3




QGSR ZINC FINGER
PEPTIDE; CHAIN: A; 


SEQFOLD
score


















PMF 1 


score 




0.24


-0.01 




0.98 


0.01 




0.36 


1 
* 


score ! 




-0.04


0.15 




0.19 


0.25 




-0.01 


Psi 
Blast 




co 

1-H 

t 

o 

i-H 

IO 


<o 

1-H 




$ 


1.7e-12 




m 

1-H 

* 
CN 






1-H 


OO 




NO 
i-H 


»o 

i-H 

VO 




§ 


START 
AA 




«n 


co 






s 

CO 




00 

1-H 
i-H 




a 






< 




< 






< 


la 




1gta


ft 

i-H 




1buo


1gof




1a1h


SEQ ID
NO: 




p* 

to 


^H 
i-H 
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i-H 
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f-H 
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PDB annotation 


(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 


DP 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


liP -PI 

ililpi 

U E Ah W Q CO w 


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA


Compound 




of 

ii 

5 o 


6* 

1 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G;




DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G;




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER 


6* 

J 

i 


i 


score 


















score 




0.46 


1.00


1.00 


© 
p 


1.00 


1.00 


Verify 
score 




R 
d 


0.31 


0.20 


0.22 


0.24 


0.30 


Psi 
Blast 




i 

00 


1.7e-50


A 

CO 


i 

CO 


00 


i— » 
vi 






i-H | 

co 

CO 


0\ 

\n 

CO 


S5 

CO 




5 


»— i 

!? 


START 
AA 




si 


00 

55 


s 

CO 


s 

m 


8 

CO 


g 

to 


5 
8 




a 


u 


a 


U 


U 


(J 






1mey


1mey


1 


1mey


1mey




5" 
B 

«—» 


SEQ ID
NO: 
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9 s < f 



"7 IUt iC \C 

5 P 




1 







g 



w 

CO 



2 



9 



4 



I 



9 



CO 
CO 
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PDB annotation 


PROTEIN


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2
TRANSCRIPTION 
INITIATION, ZINC FINGER 
PROTEIN 


REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2


Compound 




TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


z 
n n 

a 8 

zSpf 

ill 


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5 
INITIATOR ELEMENT 
DNA; CHAIN: A, B; 


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5 
INITIATOR ELEMENT 
DNA; CHAIN: A, B; 


SEQFOLD 
score 




m 










PMF 
score 






1.00


• 

H 


0.21 

- 


0.00 


Verify 
score 






0.37 


o 


-0.23 


-0.17 


Psi 
Blast 




O 
CO 

A 


o\ c 

CO c 
0 


* 

o 


OS 

6 

1—4 


l 

CO 






m 

00 


«o 0 
co c 
^ c 


o 




s 

CO 


START 

1 AA 




co 


CO V 
CO 0 


% 


ft 


oo 

VO 
»— J 


CHAIN 
ID 




< 


< 


c 


U 


u 


ge 




»— 1 






1ubd


1ubd
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PDB annotation 


FINGER PROTEIN, DNA- 
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA- 
PROTEIN RECOGNITION, 3 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 


YANG 1; TRANSCRIPTION 
INITIATION, INITIATOR 
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA- 
PROTEIN RECOGNITION, 3 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION


Compound 




YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;

• • 


SEQFOLD 
score 










79.23 


p. 


1 score 




0.98 


© 


1.00 




Verify 
score 




-0.12 


0.06 


0.02 

. 




2 


Blast 




co 

«n 
i— < 


le>34 


? 

00 

vd 


9 

CO 


END
AA




On 
«o 
en 


t- 

00 

co 




§ 


START 
AA 




§ 




CO 


s 

CO 




S 




U 


U 


a 


a 


PDB


9 




1ubd


1ubd




1ubd




1ubd


SEQ ID
NO: 




en 

CO 


CO 
CO 


CO 
CO 


CO 
CO 
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is 
Hi 



o 

I 





p 2 

M 



s i 



S3 

CO 



OS 



00 



o 
r- 



S 



a 



la 



CO 





6 g 

lis* 

of S -e 



1 

5; 

H I 



8 



5? 



u 



2 



CO 



8 



CM 

8 



oo 



1 



8 



1 
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SBSB 




3 
O 

I 
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2 J 



3 

a* 



a 



•8 







5 



s 



8 



o 
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PDB annotation 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) YING- 
YANG 1; TRANSCRIPTION 
INITIATION, INITIATOR 
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA- 
PROTEIN RECOGNITION, 3 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) YING- 
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA- 
PROTEIN RECOGNITION, 3 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) YING- 


YANG 1; TRANSCRIPTION 
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)




-tar; 


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-


Compound 




YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5 


i « 

ii 

ii 


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5 


li 


YY1; CHAIN: C; ADENO- 
ASSOCIATED VIRUS P5 


INITIATOR ELEMENT 
DNA; CHAIN: A, B; 


o 

co ^ g ^ 

§§11 


PEPTIDE) COMPLEXED 
WITH 2DRP 3 DNA 2DRP 4 


||c 




S 


score 




91.94 












score 






0.99 


vo 

00 

© 


0.12 


1.00 


i 


score 






-0.12 






o 
o 

9 






On 

ON 


s 

CO 
CO 




ON 


•s 

00 

vo 






m 
co 




I 


VO 


VO 

cs 


START 
AA 




CM 


00 


o 

CO 


o 
m 


CO 
CO 


1 CHAIN ! 


a 




U 


o 


u 


< 


< 






1ubd


1ubd


1ubd




2gli


SEQ ID
NO:




r- 
«— « 

CO 


CO 


t 

CO 


CO 


1-4 

CO 
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441 
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2? 







i 



I 



1 



fa £ 



f 





8 



83 



00 

d 



3 

NO 
NO 



Is 



8 



d 



9 





.s 




s 
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PDB annotation 


IMMUNOGLOBULIN FOLD, 
ALTERNATIVE SPLICING, 
SIGNAL, 3 MUSCLE 


I 




GLYCOPROTEIN CD4; 
IMMUNOGLOBULIN FOLD, 
TRANSMEMBRANE, 
GLYCOPROTEIN, T-CELL, 2 
MHC LIPOPROTEIN, 
POLYMORPHISM 


MUSCLE PROTEIN 
IMMUNOGLOBULIN 
SUPERFAMILY, I SET,
MUSCLE PROTEIN 


IMMUNE SYSTEM P58
NATURAL KILLER CELL
RECEPTOR; KIR, NATURAL
KILLER RECEPTOR,
INHIBITORY RECEPTOR, 2
IMMUNOGLOBULIN


IMMUNE SYSTEM P58
NATURAL KILLER CELL
RECEPTOR; KIR, NATURAL
KILLER RECEPTOR,
INHIBITORY RECEPTOR, 2
IMMUNOGLOBULIN


IMMUNE SYSTEM CD32;
RECEPTOR, FC, CD32,
IMMUNE SYSTEM


Is! 

1§ i 

Use 


n ■ 

! 

> 

I 

i 

! 


Compound 




I 
j 


MODULE M5 


(CONNECTIN) 1TNM 3
(NMR, MINIMIZED
AVERAGE STRUCTURE)
1TNM 4 1TNM 58


T-CELL SURFACE 
GLYCOPROTEIN CD4; 

ATKT. A 13. 


* 


ii 

1 

li 


1 

> 

1 


MHC CLASS I NK CELL
RECEPTOR PRECURSOR; 
CHAIN: A; 


MHC CLASS I NK CELL
RECEPTOR PRECURSOR; 
CHAIN: A; 


FC GAMMA RIIB; CHAIN: 
A; 


I 

c 


MOLECULE, LARGE 
ISOFORM; CHAIN: A;




j 


score 




















* 


score 




0.17 


Ok 

© 


0.13 


0.24 


0.36 


0.21 


0.12 




> 


score 




0.15 


0.26 


0.35 


-0.13 


0.07 

i 


-0.03 


0.46 




* 


Blast 




*— c 

6 

CO 
CO 


Ov 

ON 


«t 

6 
so 


1.3e-10 


I 


3 

CO 


i-H 








«— 1 
CN 
1— < 




»— « 
CM 












START 
AA 




VO 

CO 


CO 
CO 








a 








a 










< 


< 


< 


< 




1 PDB 


a 




1 


1wio


1wit


*o 

CN 


s 


2fcb 


3ncm 




SEQ ID
NO: 




3 




CO 


CO 




S 







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O W 
U 3 



*8§ 




8§£ 




a 2 




i 

9 
O 

1 








S 2 



s 



8 



£? g 

^ 05 



2l 



a 



s 
I 

CO 



CM 

t 

OS 
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PDB annotation 










COMPLEX
(INHIBITOR/NUCLEASE)
COMPLEX
(INHIBITOR/NUCLEASE),
COMPLEX (RI-ANG),
HYDROLASE 2 MOLECULAR
RECOGNITION, EPITOPE
MAPPING, LEUCINE-RICH 3
REPEATS


I|i | 


1 
c 

E 


Si 

! < a 


* 5 

§ 
I 


Compound 


N REGULATION/DNA) 
TRAMTRACK PROTEIN 
(TWO ZINC-FINGER 
PEPTIDE) COMPLEXED 
WITH 2DRP 3 DNA 2DRP 4 


COMPLEX(TRANSCRIPTION
REGULATION/DNA)
TRAMTRACK PROTEIN
(TWO ZINC-FINGER
PEPTIDE) COMPLEXED
WITH 2DRP 3 DNA 2DRP 4


COMPLEX(TRANSCRIPTION
REGULATION/DNA)
TRAMTRACK PROTEIN
(TWO ZINC-FINGER
PEPTIDE) COMPLEXED
WITH 2DRP 3 DNA 2DRP 4




2 5 <! w 




U2 RNA HAIRPIN IV; 

CHAIN: A, C; U2B'';
CHAIN: B, D;


< 

c 


• «•» •* 
: "<« 

[SB 

! c?< m 


SEQ FOLD
score
















PMF 
score 




0.05 


0.51 




0.98 


-0.01 


0.96 




\ score ] 




0.17 


0.48 




0.20 


0.23 


0.05




! Blast 1 




I 


1 




3 


6.6e-16 


VO 
»— * 


W 






% 


oo 
oo 
<o 




Pi 




m 


i 


* 




NO 


§ 

cn 












e 




< 


< 




< 


•< 


<< 






2drp 






f 


1 


c 


SEQ ID
NO: 






§ 

CO 




CO 

en 


cs 
m 
m 


<s 
m 
m 
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o 

§ 

B 





81 




©^7 




1 
I 




< S < I 

fL, E PL* 





a 



s 



CO 



o 



is 

5? 



o 



2 



*3 



4 



CO 

to" 



o 



8 



Is 



is 

8 



i 1 



CO 
CO 



CO 
CO 



P5 

CO 
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PDB annotation 


BETA-NGF; COMPLEX, 
TRKA RECEPTOR, NERVE 
GROWTH FACTOR, 
CYSTEINE KNOT, 2 
IMMUNOGLOBULIN LIKE 


DOMAIN, NERVE GROWTH
FACTOR/TRKA COMPLEX


ACETYLATION RNASE 
INHIBITOR, 

RIBONUCLEASE/ANGIOGENIN
INHIBITOR
ACETYLATION, LEUCINE- 
RICH REPEATS 


CELL ADHESION PROTEIN 
NCAM MODULE 2; CELL 
ADHESION, 


GLYCOPROTEIN, HEPARIN- 
BINDING, GPI-ANCHOR, 2 
NEURAL ADHESION
MOLECULE,
IMMUNOGLOBULIN FOLD,
HOMOPHILIC 3 BINDING,
CELL ADHESION PROTEIN




OXIDOREDUCTASE
METHYLENETHF
DEHYDROGENASE/
METHENYLTHFTHF,
BIFUNCTIONAL,
DEHYDROGENASE,
CYCLOHYDROLASE,
FOLATE, 2
OXIDOREDUCTASE
HEADER


i 

| wspg 


Compound 


TRKA RECEPTOR; CHAIN: 


«• 
< 


RIBONUCLEASE 
INHIBITOR; CHAIN: NULL; 


NEURAL CELL ADHESION 
MOLECULE, LARGE 
ISOFORM; CHAIN: A; 






% 


OFOLATE 

DEHYDROGENASE/ 
CHAIN: A, B; 


§§ 

9s 


SEQ FOLD 
score 














PMF 
score 




0.93 


0.80 




0.98 


1 0.63 


Verify 
score 




0.31 


0.33 




0.33 


0.22 


Psi 
Blast 




o> 




! 




0.00023 


c 








v3 

cn 






1— 1 

»o 


r-4 

to 


START 
AA 




8 


O 
«— « 




8 




CHAIN 
ID 






< ■ 




< 




la 




1 


I 

cn 










SEQ ID 
NO: 




cn 
cn 


cs 
cn 
cn 




CI 

m 


cn 
cn 
cn 
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PDB annotation 


SUBUNIT; GAMMA1,
TRANSDUCIN GAMMA
SUBUNIT; COMPLEX (GTP-
BINDING/TRANSDUCER), G
PROTEIN, HETEROTRIMER 2
SIGNAL TRANSDUCTION




COMPLEX 

(KINASE/INHIBITOR) CDK6; 
P19INK4D; CYCLIN 
DEPENDENT KINASE,
CYCLIN DEPENDENT
KINASE INHIBITORY 2
PROTEIN, CDK, INK4, CELL
CYCLE, COMPLEX
(KINASE/INHIBITOR)
HEADER HELIX


COMPLEX (INHIBITOR 
PROTEIN/KINASE) 
INHIBITOR PROTEIN, 
CYCLIN-DEPENDENT 
KINASE, CELL CYCLE 2 
CONTROL, ALPHA/BETA, 
COMPLEX (INHIBITOR
PROTEIN/KINASE)


PHOSPHOTRANSFERASE 
PROTEIN KINASE 1CKI 18


n 


OXYGEN TRANSPORT
OXYGEN TRANSPORT,
HEME, RESPIRATORY
PROTEIN, ERYTHROCYTE


OXYGEN TRANSPORT
OXYGEN TRANSPORT,
HEME, RESPIRATORY
PROTEIN, ERYTHROCYTE


OXYGEN TRANSPORT
OXYGEN TRANSPORT,
HEME, RESPIRATORY


Compound 


C 






CYCLIN-DEPENDENT 
KINASE 6; CHAIN: A, C; 


CYCLIN-DEPENDENT 
KINASE INHIBITOR; 
CHAIN: B, D;




CYCLIN-DEPENDENT 
KINASE 6; CHAIN: A; 
P19INK4D; CHAIN: B; 


CASEIN KINASE I DELTA;
1CKI 6 CHAIN: A, B; 1CKI 7




i 

g 

S « 


\ 

c 


A 


HEMOGLOBIN; CHAIN: A, 




SEQ FOLD
score
















135.78 


102.38 


PMF ! 


score 






0.03 


0.24 


0.17 




1.00 






Verify 
score 






-0.23 


-0.01 


-0.33 




0.65 








Blast 






0.00099 


0.0023 




0.0066






I 


I 


oo 

1 

m 


I END 1 








9 


i 






i— » 
i— c 


i— c 


«— t 

Tf 
~* 


START 
AA 






S3 


cn 

OO 


cn 






»— < 




1 CHAIN 1 


e 






< 


< 


< 




< 


< 


A 


la 






OO 
JO 


1blx


1cki








•% 


a 


NO: 






cn 


cn 


cn 
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2* 
Sn 
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a w 
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IT) 



S 
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CO 
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PDB annotation 


CHIMERA PROTEIN, 
RESPIRATORY PROTEIN, 
HEME 






OXYGEN 

STORAGE/TRANSPORT HB 
D; HB D HEMOGLOBIN D (R- 
STATE) 1, HEMOGLOBIN,
AVIAN, HIGH 2
COOPERATIVITY, OXYGEN
TRANSPORT 


OXYGEN 

STORAGE/TRANSPORT HB 
D; HB D HEMOGLOBIN D (R- 
STATE) 1, HEMOGLOBIN, 
AVIAN, HIGH 2 
COOPERATIVITY, OXYGEN
TRANSPORT 


Ml s 

g5 S g o 

ipllls 

llsiiii 

oE3qS«:8p 






OXYGEN TRANSPORT X-
RAY STUDY, PORCINE
HEMOGLOBIN, ARTIFICIAL


Compound 


BETA-ALPHA; CHAIN: A, 
B,C,D; 


OXYGEN CARRIER 
HEMOGLOBIN (DEOXY) 


1HBH 3


OXYGEN CARRIER 
HEMOGLOBIN (DEOXY) 
1HBH 3


HEMOGLOBIN D; CHAIN: 
A, C; HEMOGLOBIN D; 
CHAIN: B,D; 


HEMOGLOBIN D; CHAIN: 
A, C; HEMOGLOBIN D; 
CHAIN: B,D; 


HEMOGLOBIN D; CHAIN: 
A, C; HEMOGLOBIN D; 
CHAIN: B, D; 


si 

Ii 
II 

§i 


1HDA 3


OXYGEN TRANSPORT 
HEMOGLOBIN (DEOXY) 
1HDA 3


PORCINE HEMOGLOBIN
(ALPHA SUBUNIT);
CHAIN: A, C; PORCINE


SEQFOLD 
score 




120.38 






148.19 


a 

vd 

ON 




136.61 




PMF 
score 






1.00 


1.00 






1.00 




1.00


Verify 
score 






0.61 


0.93 






0.49 




0.78 


Psi 
Blast 




1 


i 

On 


l 

Ov 


I 

ON 


00 

I 


? 

cn 
cn 


cn 
cn 


% 

cn 
cn 






Tf 
r-t 






y-( 
Tf 
pH 


oo* 
cn 

r-4 






1-4 

i— 1 


START 
AA 




—4 


CN 








1—4 






CHAIN 
ID 








< 


< 


A 


<: 


< 


< 






1hbh


1hbh


1hbr


1hbr


1hbr


1hda


1hda


1qpw


SEQ ID
NO: 










cn 


cn 


to 


cn 


cn 
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PDB annotation 


HUMAN BLOOD, 2 OXYGEN 
TRANSPORT 


OXYGEN TRANSPORT X- 
RAY STUDY, PORCINE 
HEMOGLOBIN, ARTIFICIAL 
HUMAN BLOOD, 2 OXYGEN
TRANSPORT




OXYGEN TRANSPORT 
OXYGEN TRANSPORT, 
HEME, RESPIRATORY 
PROTEIN, ERYTHROCYTE 


OXYGEN TRANSPORT 
OXYGEN TRANSPORT, 
HEME, RESPIRATORY 
PROTEIN, ERYTHROCYTE 


OXYGEN TRANSPORT
OXYGEN TRANSPORT,
HEME, RESPIRATORY
PROTEIN, ERYTHROCYTE


OXYGEN TRANSPORT 
HEME PROTEIN, MODEL 
COMPOUNDS, OXYGEN 
STORAGE, LIGAND 2
BINDING GEOMETRY,
CONFORMATIONAL
SUBSTATES, OXYGEN 3
TRANSPORT


OXYGEN TRANSPORT
OXYGEN TRANSPORT




t 

\ 

i 

Q 


3 

\ 

■> 


1 HEMOGLOBIN (BETA 


Q 
ffl 

S 

oo 


PORCINE HEMOGLOBIN
(ALPHA SUBUNIT);
CHAIN: A, C; PORCINE
HEMOGLOBIN (BETA
SUBUNIT); CHAIN: B, D




HEMOGLOBIN; CHAIN: A, 
B 




HEMOGLOBIN; CHAIN: A,




HEMOGLOBIN; CHAIN: A,


m 


MYOGLOBIN; CHAIN:
NULL;




HEMOGLOBIN; CHAIN: A, 
E.C.F; 


OXYGEN TRANSPORT 
HEMOGLOBIN 
THIONVILLE ALPHA
CHAIN MUTANT WITH
VAL 1 1BAB 3 REPLACED
BY GLU AND AN
ACETYLATED MET
BOUND TO THE 1BAB 4
AMINO TERMINUS 1BAB 5


SEQFOLD 
score 




133.84 






106.59 


87.72 




93.82 




I 


score 








00 

o 






0.49 




1.00


Verify 
score 








0.03 






i 




0.38 


Psi 
Blast 




? 

cn 
cn 




I 

CO 
CO 


cn 

CO 


2 


00 

1 


m 

CO 












ON 

«— i 


On 


ON 


On 
*-» 


oo 


ON 


START 
AA 








I— 4 




ft 




cn 




CHAIN 
ID 




< 




< ■ 


< 


tt 




W 


< 


u 




1qpw










1 




1bab


SEQ ID 
NO: 




cn 




9 

cn 


3 




59 

cn 


CO 


CO 
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PDB annotation 






OXYGEN 

STORAGE/TRANSPORT 
HEME, OXYGEN DELIVERY 
VEHICLE, BLOOD 
SUBSTITUTE 


OXYGEN TRANSPORT
OXYGEN TRANSPORT,


J 

i £ 

^ o 
j ^ 


OXYGEN TRANSPORT
OXYGEN TRANSPORT,
CHIMERA PROTEIN,
RESPIRATORY PROTEIN,
HEME




Compound 


OXYGEN TRANSPORT 
HEMOGLOBIN 
THIONVILLE ALPHA 


CHAIN MUTANT WITH 


VAL 1 1BAB 3 REPLACED
BY GLU AND AN
ACETYLATED MET
BOUND TO THE 1BAB 4
AMINO TERMINUS 1BAB 5


OXYGEN TRANSPORT 
HEMOGLOBIN 
THIONVILLE ALPHA 


CHAIN MUTANT WITH 
VAL 1 1BAB 3 REPLACED
BY GLU AND AN
ACETYLATED MET
BOUND TO THE 1BAB 4
AMINO TERMINUS 1BAB 5


DEOXYHEMOGLOBIN 
(ALPHA CHAIN); CHAIN: 
A; DEOXYHEMOGLOBIN
(BETA CHAIN); CHAIN: B, 
D; 


MODULE-SUBSTITUTED 
CHIMERA HEMOGLOBIN 
BETA-ALPHA; CHAIN: A, 
B,C,D; 


MODULE-SUBSTITUTED 
CHIMERA HEMOGLOBIN 
BETA-ALPHA; CHAIN: A, 
B,C,D; 


OXYGEN TRANSPORT 
MYOGLOBIN 
COMPLEXED WITH
CYANIDE 1EMY 3 1EMY
107 HEME PROTEIN,
GLOBIN FOLD 1EMY 5


SEQ FOLD
score


113.33 


86.57 






94.07 




PMF


score 






0.25 


0.99 




0.95 


1 
* 


score 






0.23 


0.13 




0.44 


Psi 
Blast 


2e-40 


CO 
CO 


1 


1.6e-36 


VO 

1 


oo 

2 




OS 




*— 1 


o\ 

< 


os 


i-H 


START 
AA 


CO 


3 








CM 


Z 
>■ 


a 


< 


03 


< 


< 


•< 






1bab


1bab


i-H 


1 


i 


1emy


£ 


NO: 


CO 


CO 


CO 


3 


CO 


CO 
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PDB annotation 










OXYGEN
STORAGE/TRANSPORT HB 
D; HB D HEMOGLOBIN D (R- 
STATE) 1, HEMOGLOBIN, 
AVIAN, HIGH 2 
COOPERATIVITY, OXYGEN
TRANSPORT 


OXYGEN 

STORAGE/TRANSPORT HB 
D; HB D HEMOGLOBIN D (R- 
STATE) 1, HEMOGLOBIN,
AVIAN, HIGH 2
COOPERATIVITY, OXYGEN
TRANSPORT


■V ft ,P»K rfUM «*•»» r# rfM 

Sftg w 

I§§ § 

s| o o 

iisiiii 

o5qS3<op 






n 

j 

t 


i 

s 


OXYGEN TRANSPORT 
HEMOGLOBIN (DEOXY, 
HUMAN FETAL F=II=)
1FDH G 1 1FDH H 2


OXYGEN CARRIER 
HEMOGLOBIN (DEOXY) 
1HBH 3


OXYGEN CARRIER 
HEMOGLOBIN (DEOXY) 
1HBH 3


OXYGEN CARRIER 
HEMOGLOBIN (DEOXY) 
1HBH 3


HEMOGLOBIN D; CHAIN: 
A, C; HEMOGLOBIN D; 
CHAIN: B, D;








0 1 

Q 

Si 

g 1 

n ' 


• *» 

3 
s 

5 Q 
1 w * 


OXYGEN TRANSPORT 
HEMOGLOBIN (DEOXY) 


1HDA 3


OXYGEN TRANSPORT 
HEMOGLOBIN (DEOXY) 


SEQFOLD 
score 


100.14 




101.11 
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PDB annotation 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) SREBP-
1A; STEROL REGULATORY
ELEMENT BINDING
PROTEIN, 2 BASIC-HELIX-
LOOP-HELIX-LEUCINE
ZIPPER, SREBP,


TRANSCRIPTION 3 FACTOR, 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 








METAL BINDING PROTEIN 
RING FINGER PROTEIN 
MAT1; RING FINGER
(C3HC4)


^1 




TRANSFERASE HRS; HRS,
VHS, FYVE, ZINC FINGER,
SUPERHELIX


METAL BINDING PROTEIN
RING FINGER PROTEIN
MAT1; RING FINGER
(C3HC4)




PATHOGENESIS-RELATED


Compound 




STEROL REGULATORY
ELEMENT BINDING


PROTEIN 1A; CHAIN: A, B, 
C, D; DNA; CHAIN: E, F, G, 
H; 




VIRUS EQUINE HERPES
VIRUS-1 (C3HC4, OR RING
DOMAIN) 1CHC 3 (NMR, 1
STRUCTURE) 1CHC 4


VIRUS EQUINE HERPES
VIRUS-1 (C3HC4, OR RING
DOMAIN) 1CHC 3 (NMR, 1


5 

T-H 

\ 


CDK-ACTIVATING
KINASE ASSEMBLY
FACTOR MAT1; CHAIN: A;


VIRUS EQUINE HERPES
VIRUS-1 (C3HC4, OR RING
DOMAIN) 1CHC 3 (NMR, 1


1


HEPATOCYTE GROWTH
FACTOR-REGULATED
TYROSINE CHAIN: A;


CDK-ACTIVATING
KINASE ASSEMBLY
FACTOR MAT1; CHAIN: A;




PATHOGENESIS-RELATED I 
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PDB annotaUon 


STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,


Compound 




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;
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PDB annotation 


(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER


t 
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i 

> 




DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 




TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


SEQFOLD 
score 
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PDB annotation 


PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
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Compound 




TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


SEQFOLD 
score 
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score 
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PDB annotation 


ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


TRANSPORT PROTEIN TC4;
GTPASE, NUCLEAR
TRANSPORT, TRANSPORT
PROTEIN


TRANSPORT PROTEIN TC4;
GTPASE, NUCLEAR
TRANSPORT, TRANSPORT
PROTEIN


Compound 




ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;




GTP-BINDING PROTEIN 
RAN; CHAIN: A, B; 


GTP-BINDING PROTEIN 
RAN; CHAIN: A, B; 


SEQFOLD 
score 






99.97 
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score 
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PDB annotation 


(GTPASE
ACTIVATION/PROTO-
ONCOGENE), GTPASE, 2
TRANSITION STATE, GAP


COMPLEX (GTP-
BINDING/EFFECTOR) RAS-
RELATED PROTEIN RAB3A;
COMPLEX (GTP-
BINDING/EFFECTOR), G
PROTEIN, EFFECTOR,
RABCDR, 2 SYNAPTIC
EXOCYTOSIS, RAB
PROTEIN, RAB3A,
RABPHILIN


COMPLEX (GTP-
BINDING/EFFECTOR) RAS-
RELATED PROTEIN RAB3A;
COMPLEX (GTP-
BINDING/EFFECTOR), G
PROTEIN, EFFECTOR,
RABCDR, 2 SYNAPTIC
EXOCYTOSIS, RAB
PROTEIN, RAB3A,
RABPHILIN


PROTEIN BINDING EF-G; EF-
G ELONGATION FACTOR,
TRANSLOCASE, RIBOSOME,
ELONGATION, 2
TRANSLATION, PROTEIN
SYNT FACTOR, GTPASE,
GTP BINDING, 3
GUANOSINE NUCLEOTIDE
BINDING, PROTEIN
BINDING


HYDROLASE G PROTEIN,
VESICULAR TRAFFICKING,
GTP HYDROLYSIS, RAB 2
PROTEIN,
NEUROTRANSMITTER


Compound 




RAB-3A; CHAIN: A;
RABPHILIN-3A; CHAIN: B;


RAB-3A; CHAIN: A; 
RABPHILIN-3A; CHAIN: B; 


ELONGATION FACTOR G; 
CHAIN: A; ELONGATION 
FACTOR G DOMAIN 3; 
CHAIN: B; 
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PDB annotation 


RIBONUCLEOPROTEIN A1,
NUCLEAR PROTEIN,
HNRNP, RBD, RRM, RNP,
RNA BINDING, 2
RIBONUCLEOPROTEIN


RNA BINDING PROTEIN
RNA-BINDING DOMAIN


RNA-BINDING DOMAIN
RNA-BINDING DOMAIN,
ALTERNATIVE SPLICING


COMPLEX
(RIBONUCLEOPROTEIN/DN
A) HNRNP A1, UP1;
COMPLEX
(RIBONUCLEOPROTEIN/DN
A), HETEROGENEOUS
NUCLEAR 2
RIBONUCLEOPROTEIN A1


RNA-BINDING
PROTEIN/RNA TRA PRE-
MRNA; SPLICING
REGULATION, RNP
DOMAIN, RNA COMPLEX


RNA BINDING PROTEIN
RNA-BINDING DOMAIN


RNA BINDING PROTEIN
RNA-BINDING DOMAIN


Compound 




HETEROGENEOUS
NUCLEAR
RIBONUCLEOPROTEIN D0;
CHAIN: A;


RNA-BINDING PROTEIN
SEX-LETHAL PROTEIN (C-
TERMINUS, OR SECOND
RNA-BINDING DOMAIN
1SXL 3 (RBD-2), RESIDUES
199 - 294 PLUS N-
TERMINAL MET) 1SXL 4
(NMR, 17 STRUCTURES)
1SXL 5


SEX-LETHAL PROTEIN;
CHAIN: NULL;


HETEROGENEOUS
NUCLEAR
RIBONUCLEOPROTEIN A1;
CHAIN: A; 12-
NUCLEOTIDE SINGLE-
STRANDED TELOMERIC
DNA; CHAIN: B;
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PDB annotation 


TRANSCRIPTION FACTOR 
P65; P50D; TRANSCRIPTION 
FACTOR, IKB/NFKB 
COMPLEX 




illi 


ANK-REPEAT
MYOTROPHIN,
ACETYLATION, NMR, ANK-
REPEAT


COMPLEX (TRANSCRIPTION
REG/ANK REPEAT)
COMPLEX (TRANSCRIPTION
REGULATION/ANK
REPEAT), ANKYRIN 2
REPEAT HELIX


TRANSCRIPTION
REGULATION
TRANSCRIPTION
REGULATION, ANKYRIN


COMPLEX (ANTI-
ONCOGENE/ANKYRIN
REPEATS) P53BP2;
ANKYRIN REPEATS, SH3,
P53, TUMOR SUPPRESSOR,
MULTIGENE 2 FAMILY,
NUCLEAR PROTEIN,
PHOSPHORYLATION,
DISEASE MUTATION, 3
POLYMORPHISM, COMPLEX
(ANTI-
ONCOGENE/ANKYRIN
REPEATS)


COMPLEX (TRANSCRIPTION


Compound 


NF-KAPPA-B P65
SUBUNIT; CHAIN: A; NF-
KAPPA-B P50D SUBUNIT;
CHAIN: C; I-KAPPA-B-
ALPHA; CHAIN: D;


MYOTROPHIN; CHAIN:
NULL


MYOTROPHIN; CHAIN:
NULL


NF-KAPPA-B P65; CHAIN:
A, C; NF-KAPPA-B P50;
CHAIN: B, D; I-KAPPA-B-
ALPHA; CHAIN: E, F;


REGULATORY PROTEIN
SWI6; CHAIN: A, B;


P53; CHAIN: A; 53BP2;
CHAIN: B;




GA BINDING PROTEIN 


h 
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PDB annotation 


FACTOR P18-INK4C; CELL
CYCLE INHIBITOR,
P18INK4C, TUMOR,
SUPPRESSOR, CYCLIN- 2
DEPENDENT KINASE,
HORMONE/GROWTH
FACTOR


HORMONE/GROWTH
FACTOR P18-INK4C; CELL
CYCLE INHIBITOR,
P18INK4C, TUMOR,
SUPPRESSOR, CYCLIN- 2
DEPENDENT KINASE,
HORMONE/GROWTH
FACTOR


CELL CYCLE INHIBITOR
P18-INK4C(INK6); CELL
CYCLE INHIBITOR, P18-
INK4C(INK6), ANKYRIN
REPEAT, 2 CDK 4/6
INHIBITOR


CELL CYCLE INHIBITOR
P18-INK4C(INK6); CELL
CYCLE INHIBITOR, P18-
INK4C(INK6), ANKYRIN
REPEAT, 2 CDK 4/6


1 


i 

§£23 


TRANSCRIPTION FACTOR
P65; P50D; TRANSCRIPTION
FACTOR, IKB/NFKB
COMPLEX

ft 


4^ 

If 


Compound 


KINASE 6 INHIBITOR;


if 


CYCLIN-DEPENDENT 
KINASE 6 INHIBITOR; 

rrrr a y*y_ a - 
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9*><f 
s m ^ 

III 


CYCLIN-DEPENDENT 
KINASE 6 INHIBITOR;


CHAIN: A, B; 


i 

It 

*l 


KAPPA-B P50D SUBUNIT;
CHAIN: C; I-KAPPA-B-
ALPHA; CHAIN: D;


NF-KAPPA-B P65
SUBUNIT; CHAIN: A; NF-
KAPPA-B P50D SUBUNIT;
CHAIN: C; I-KAPPA-B-
ALPHA; CHAIN: D;
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PDB annotation 


PROTEIN, RECEPTOR


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF1;
FGFR1; IMMUNOGLOBULIN
(IG) LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


CONTRACTILE PROTEIN
IMMUNOGLOBULIN FOLD,
BETA BARREL


CONNECTIN, NEXTM5;
CELL ADHESION,
GLYCOPROTEIN,
TRANSMEMBRANE,
REPEAT, BRAIN, 2
IMMUNOGLOBULIN FOLD,
ALTERNATIVE SPLICING,
SIGNAL, 3 MUSCLE
PROTEIN


GLYCOPROTEIN CD4;
IMMUNOGLOBULIN FOLD,
TRANSMEMBRANE,
GLYCOPROTEIN, T-CELL, 2
MHC LIPOPROTEIN,
POLYMORPHISM


'It 


ft". 

c/: 

B 
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ii 
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IS 


CHAPERONE BETA SHEETS,
SHORT HELICES
MOLECULAR CHAPERONE


Compound 




FIBROBLAST GROWTH 
FACTOR 1; CHAIN: A, B; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 1; 
CHAIN: C,D; 


TELOKIN; CHAIN: A 






MUSCLE PROTEIN TITIN
MODULE M5
(CONNECTIN) 1TNM 3
(NMR, MINIMIZED
AVERAGE STRUCTURE)
1TNM 4 1TNM 58


T-CELL SURFACE 
GLYCOPROTEIN CD4; 
CHAIN: A, B; 

i 




HEAT SHOCK PROTEIN 40; 
CHAIN: A; 


HEAT SHOCK PROTEIN 40; 
CHAIN: A; 

HUMAN HSP40; CHAIN: 


SEQFOLD 
score 




















PMF 
score 




0.57 . 


1.00 


0.99 


0.96 


0.46 




1.00 


1.00 
1.00 


Verify j 
score 




0.46 


0.19 


0.53 


0.92 


i 




0.34 


vq ov 
o d 


Psi 
Blast 




I 


CO 

vd 


oo 
sd 


i 

00 

vd 


vd 




CO 

J 

CO 
CO 


CO ON 

2 I 

vd c4 


§ < 




VO 
CO 


00 
CO 


VO 
CO 


vo 

CO 


00 
CO 
t— 1 




o 

H 

CO 


o 

cn r** 


START 
AA 
















§ 




CHAIN 
ID 




U 


< 






< 




< 


< 


M 




levt 


lfllg 


Inct 






1 




00 

2 


00 

-8 2 


SEQID 
NO: 








o 




1 




oo 
o 


00 oo 

o © 



531 



WO 02/081731 



PCT/US02/01222 





















PDB annotation 


COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER


COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2


Compound 




TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;
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PDB annotation 


PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)




Compound 




YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


SEQFOLD 

score 
















PMF 
score 




0.33 


0.72 


1.00 


0.99 


fH 

d 




Verify 
score 




9 


-0.24 


0.49 


0.16 


S 

9 




Psi 
Blast 




o 

4' 

CO 


? 

»-h 

«n 


<? 

4 

CO 


3.4e-34 


CO 

v-i 


f-H 






8 

»-H 


Y-H 

ON 


§ 


VO 
00 

«<l 


rH 


CO 


START 
AA 






8 


«-H 


00 




a 

CO 




















g a 




u 


u 






< 




< 


la 




lubd 


lubd 
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PDB annotation 


FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


CALCIUM-BINDING
PROTEIN 2A9, CACY,
S100A6, PRA; CALCIUM-
BINDING PROTEIN, EF-
HAND, S-100 PROTEIN, NMR


CALCIUM/PHOSPHOLIPID
BINDING PROTEIN P11,
CALPACTIN LIGHT CHAIN;
S100 FAMILY, EF-HAND
PROTEIN, LIGAND OF
ANNEXIN II, 2
CALCIUM/PHOSPHOLIPID
BINDING PROTEIN


METAL BINDING PROTEIN
S100B, S100BETA;
S100BETA, S100B, NMR,
DIPOLAR COUPLINGS, EF-
HAND, S100 2 PROTEIN,
CALCIUM-BINDING
PROTEIN, FOUR-HELIX
BUNDLE, THREE- 3
DIMENSIONAL STRUCTURE,
SOLUTION STRUCTURE


CALCIUM-BINDING
PROTEIN SNTNC; CALCIUM-
BINDING, REGULATION,


Compound 




ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;






« 

o 
o 

»— 1 
CO 


S-100 PROTEIN, BETA 
CHAIN; CHAIN: A,B; 


N-TROPONIN C; CHAIN:
NULL;


SEQFOLD 
score 










70.10 


87.95 


84.29 




PMF 
score 




1.00 


0.90 










i 


f I 

> 8 




0.45 


8 
© 










0.02 


Psi 
Blast 






le-32 




oo 

r— 1 

i 


•2 

cn 
cn 


OO 

>© 


r-t 






o 


00 
»— 1 






so 

ON 


Si 


3 


START 
AA 






oo 




1—1 






t 


CHAIN 
ID 




< 


< 




<: 


< 


< 








2gli 


2gji 




la03 


f 

1-4 


s 

T-H 


lblq 


SEQID 
NO: 
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PDB annotation 


CONNECTIN, FIBRONECTIN
TYPE III


CONNECTIN A71,


SIGNALING PROTEIN
CYTOKINE RECEPTOR,
GLYCOPROTEIN 130, GP130,
INTERLEUKINE 6 2
RECEPTOR BETA SUBUNIT,
SIGNALING PROTEIN


SIGNALING PROTEIN
CYTOKINE RECEPTOR,
GLYCOPROTEIN 130, GP130,
INTERLEUKINE 6 2
RECEPTOR BETA SUBUNIT,
SIGNALING PROTEIN


MEMBRANE PROTEIN BETA
SANDWICH, CYTOKINE
RECEPTOR, FN3 DOMAIN


BINDING PROTEIN BINDING
PROTEIN, CYTOKINE
RECEPTOR


El 

1 


Compound 




TITIN; CHAIN: NULL;


TITIN; CHAIN: NULL;


CYTOKINE RECEPTOR
COMMON BETA CHAIN;
CHAIN: A;


NEURAL ADHESION
MOLECULE DROSOPHILA
NEUROGLIAN
(CHYMOTRYPTIC
FRAGMENT CONTAINING
THE 1CFB 3 TWO AMINO
PROXIMAL FIBRONECTIN
TYPE III REPEATS 1CFB 4
(RESIDUES 610 - 814))
1CFB 5






ERYTHROPOIETIN;


SEQFOLD 
score 








52.35 

I 






64.96 






1 


score 




0.28 


0.28 




0.37 


0.15 




0.24 


0.58 


Verify 
score 




0.12 


-0.10 




0.04 


0.09 




-0.31 


-0.05 


Psi 
Blast 




00 

f—» 

6 

cn 
c4 


i-H 

f— i 


NO 
f-H 


»— » 


f-H 

A 

cn 

f-H 


I 


r-t 


9.9e47 






Ov 
1— i 


f«H 


*- 1 
cn 
cs 


*-H 


s 


8 


s 

f-H 


f-H 
f-H 


START 
AA 






CN 

cn 






00 . 
«— t 


00 
1— • 


f-H 


OO 
»-H 




B 








•< 


< 


< 






« 






Ibpv 


Ibpv 


1 


Ibqu 


t 


Icfb 


lcto 


leer 


SEQID 
NO: 




m 
»— * 


cn 

f-H 


cn 


cn 

f-H 


cn 


cn 
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cn 


cn 
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PDB annotation 


(CYTOKINE/RECEPTOR)
EPOBP; ERYTHROPOIETIN,
ERYTHROPOIETIN
RECEPTOR, SIGNAL 2
TRANSDUCTION,
HEMATOPOIETIC
CYTOKINE, CYTOKINE
RECEPTOR 3 CLASS 1,
COMPLEX
(CYTOKINE/RECEPTOR)


CELL ADHESION PROTEIN
RGD, EXTRACELLULAR
MATRIX 1FNF 18


HEPARIN AND INTEGRIN
BINDING HEPARIN AND
INTEGRIN BINDING


HEPARIN AND INTEGRIN
BINDING HEPARIN AND
INTEGRIN BINDING


CELL ADHESION PROTEIN
CELL ADHESION PROTEIN,
RGD, EXTRACELLULAR
MATRIX, 2 HEPARIN-
BINDING, GLYCOPROTEIN


CELL ADHESION PROTEIN
CELL ADHESION PROTEIN,


CELL ADHESION PROTEIN
CELL ADHESION PROTEIN,
RGD, EXTRACELLULAR
MATRIX, 2 HEPARIN-
BINDING, GLYCOPROTEIN


STRUCTURAL PROTEIN


Compound 


CHAIN: A; 


RECEPTOR; CHAIN: B,C; 




CELL ADHESION PROTEIN
FIBRONECTIN CELL-
ADHESION MODULE TYPE
III-10 1FNA 3


FIBRONECTIN; 1FNF 6


FIBRONECTIN; CHAIN: A;


FIBRONECTIN; CHAIN: A;


FIBRONECTIN; CHAIN:
NULL;


FIBRONECTIN; CHAIN:
NULL;


FIBRONECTIN; CHAIN:
NULL;


INTEGRIN BETA-4


SEQFOLD 
score 






92.46 


81.90 




cn 
cn 

s 






VO 

VO 


PMF 
score 




0.69 






0.19 




0.07 


0.64 




Verify 
score 




-0.27 






-0.65 




-0.02 


-0.40 




Psi 
Blast 




00 


CN 

1 

oo 


le-27 


m 
»— i 

6 


i-H 


J 

OA 

o\ 


1.7e-14 










CN 
VO 

cn 


cn 


VO 

o 




«-H 


T-H 


1— t 
cn 
CN 


START 
AA 






cs 




cn 


»— < 
CM 


a 




r-l 


| CHAIN 
ID 








< 


< 










la 




a 

«-H 


31 

• 


i 

f-H 


Ifoh 






Imfh 


1 ' 


i-h 


SEQID 
NO: 
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»— 1 


cn 
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PDB annotation 


INTEGRIN,
HEMIDESMOSOME,
FIBRONECTIN,
CARCINOMA, STRUCTURAL
2 PROTEIN


STRUCTURAL PROTEIN


FIBRONECTIN,
CARCINOMA, STRUCTURAL
2 PROTEIN


STRUCTURAL PROTEIN
TENASCIN, FIBRONECTIN
TYPE-III, HEPARIN,
EXTRACELLULAR 2
MATRIX, ADHESION,
FUSION PROTEIN,


STRUCTURAL PROTEIN
TENASCIN, FIBRONECTIN
TYPE-III, HEPARIN,
EXTRACELLULAR 2
MATRIX, ADHESION,
FUSION PROTEIN,
STRUCTURAL PROTEIN




Compound 


INTEGRIN BETA-4


TENASCIN; CHAIN: A, B;


GLYCOPROTEIN
FIBRONECTIN (TENTH
TYPE III MODULE) (NMR,
36 STRUCTURES) 1TTF 3


FIBRONECTIN; CHAIN: A;


SEQFOLD 
score 






70.56 










PMF 1 


score | 




0.35 




0.28 


d 


0.95 


0.04 


Verify 
score 




0.16 




0.00 


-0.29 


0.29 


•0.68 




Blast 






oo 

A 

CO 


3.4e-18 




m 
i—i 

A 












£ 


9 


s 

1—1 


o 
i— » 

l-H 


ON | 


START 
AA 












<s 
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s 
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M 
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%' 
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i-H 
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PDB annotation 


FINGER PROTEIN, DNA- 
PROTEIN RECOGNITION, 3 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 








LIGASE CBL, UBCH7, ZAP-
70, E2, UBIQUITIN, E3,
PHOSPHORYLATION, 2
TYROSINE KINASE,
DEGRADATION,


METAL BINDING PROTEIN
RING FINGER PROTEIN
MAT1; RING FINGER
(C3HC4)


DNA-BINDING PROTEIN
V(D)J RECOMBINATION
ACTIVATING PROTEIN 1;
RAG1, V(D)J
RECOMBINATION,
ANTIBODY, MAD, RING
FINGER, 2 ZINC BINUCLEAR
CLUSTER, ZINC FINGER,
DNA-BINDING PROTEIN


KINASE KINASE, SIGNAL
TRANSDUCTION,
CALCIUM/CALMODULIN


Compound 




COMPLEX(TRANSCRIPTIO
N REGULATION/DNA)
TRAMTRACK PROTEIN
(TWO ZINC-FINGER
PEPTIDE) COMPLEXED
WITH 2DRP 3 DNA 2DRP 4


VIRUS EQUINE HERPES
VIRUS-1 (C3HC4, OR RING
DOMAIN) 1CHC 3 (NMR, 1
STRUCTURE) 1CHC 4


SIGNAL TRANSDUCTION
PROTEIN CBL; CHAIN: A;
ZAP-70 PEPTIDE; CHAIN:
B; UBIQUITIN-
CONJUGATING ENZYME
E2-18 KDA UBCH7;
CHAIN: C;


CDK-ACTIVATING
KINASE ASSEMBLY
FACTOR MAT1; CHAIN: A;


CALCIUM/CALMODULIN-
DEPENDENT PROTEIN
KINASE; CHAIN: NULL;


SEQFOLD 
score 


















110.80 


PMF 
score 




0.03 




0.66 


0.48 


0.40 


0.05 






li 




0.48 




-0.12 


-0.52 


0.50 


VO 
CO 
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Blast 
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CO 
CO 
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? 

i-H 


CO 

6 

CO 
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vO 


vb 

UO 
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00 
ON 
CO 


START 
AA 




CO 




CN 
i— » 


VO 




CO 






CHAIN 
ID 




<: 








< 








la 




2dip 




Ichc 


Ifbv 




Irmd 
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NO: 
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PDB annotation J 


INHIBITOR ENTEROKINASE,
HEAVY CHAIN;
ENTEROKINASE, LIGHT


COMPLEX
(PROTEASE/INHIBITOR)
TRYPSIN, COAGULATION
FACTOR XA, CHIMERA,
PROTEASE, PPACK, 2


COMPLEX
(PROTEASE/INHIBITOR)
TRYPSIN, COAGULATION
FACTOR XA, CHIMERA,
PROTEASE, PPACK, 2


Compound 


CHAIN: A;
ENTEROPEPTIDASE;
CHAIN: B; VAL-ASP-ASP-
ASP-ASP-LYS PEPTIDE;
CHAIN: C;


HYDROLASE (SERINE
PROTEASE) PORCINE E-
TRYPSIN (E.C.3.4.21.4)
1EPT 3


COAGULATION FACTOR
XA-TRYPSIN CHIMERA;
CHAIN: A; D-PHE-PRO-
ARG-
CHLOROMETHYLKETONE
(PPACK) WITH CHAIN: I;


COAGULATION FACTOR
XA-TRYPSIN CHIMERA;
CHAIN: A; D-PHE-PRO-
ARG-
CHLOROMETHYLKETONE


HYDROLASE (SERINE
PROTEINASE) GAMMA-
CHYMOTRYPSIN A
(E.C.3.4.21.1) (PH 7.0)
1GCT 3


GAMMA CHYMOTRYPSIN; 


if 1 

j i « i 0* 


SEQFOLD 
score 






155.00 




125.17 




PMF 
score 




0.65 




1.00 




0.94 


Verify 
score 




-0.81 




0.53 




0.27 


Psi 
Blast 




*o 


00 

I 

»— i 


oo 
oo 


1.5e-78 


I 

OO 

vd 


s < 
w < 






s 


a 


a 


3 


START 
AA 




OS 
»— t 


oo 


o\ 




ON 
i— 1 


CHAIN 
ID 




< 


< 


< 


< 


m 






lept 


lfxy 


I— 1 


a 
f— i 


vo 

00 

oo 

i-H 


SEQID 
NO: 




r- 

i-H 






r- 
■» 


r- 

i-H 
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PDB annotation 


SERINE PROTEASE SERINE
PROTEASE, HYDROLASE,
MAST CELL, ANGIOTENSIN,
ALPHA 2
TOLUENESULFONIC ACID


SERINE PROTEINASE
SERINE PROTEINASE,
GLYCOPROTEIN


COMPLEX (BLOOD
COAGULATION/INHIBITOR)
CHRISTMAS FACTOR;
COMPLEX, INHIBITOR,
HEMOPHILIA/EGF, BLOOD
COAGULATION, 2 PLASMA,


HYDROLASE
MICROPLASMINOGEN,
SERINE PROTEASE,
ZYMOGEN,
CHYMOTRYPSIN 2 FAMILY,
HYDROLASE


GROWTH FACTOR 7S NGF;
GROWTH FACTOR (BETA-
NGF), HYDROLASE - SERINE


CHYMASE; CHAIN: NULL;


COMPLEX(PROTEINASE/I
NHIBITOR) TRYPSIN
(E.C.3.4.21.4) COMPLEXED
WITH INHIBITOR FROM
BITTER 1MCT 3 GOURD
1MCT 4


COMPLEX(PROTEINASE/I
NHIBITOR) TRYPSIN
(E.C.3.4.21.4) COMPLEXED
WITH INHIBITOR FROM
BITTER 1MCT 3 GOURD
1MCT 4


NEUROPSIN; CHAIN: A, B;


FACTOR IXA; CHAIN: C,
L; D-PHE-PRO-ARG;
CHAIN: I;


NERVE GROWTH
FACTOR; CHAIN: A, B, G,
X, Y, Z;


SEQFOLD 
score 


125.55 


j 173.18 




139.50 


115.05 


114.05 


119.87 


PMF 
score 






1.00 










Verify 
score 






0.76 










Psi 
Blast 


s 


o 


o 


oo 

& 

«n 


3.4e-77 


le-81 


3 


is 




8 


»— i 


OS 




i-H 


r-H 


START 
AA 




00 
1-4 


On 

1-4 




00 






CHAIN 
ID 




< 


< 


< 


u 


< 


< 


M 


lklt 


I 


1 


lnpra 


t 


Iqrz 


isgf 


SEQID 
NO: 


r- 
5 


—t 

xr 


r- 
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PDB annotation 






COMPLEX (SERINE
PROTEASE/COAGULATION)
COMPLEX (SERINE
PROTEASE/COAGULATION),
SERINE, PROTEASE, 2
THROMBIN


COMPLEX (SERINE
PROTEASE/PEPTIDE)
FIBRINOPEPTIDE-A,
COMPLEX (SERINE
PROTEASE/PEPTIDE), 2
THROMBIN


Compound 


HYDROLASE (SERINE
PROTEINASE) TRYPSIN
(E.C.3.4.21.4) COMPLEXED
WITH THE INHIBITOR
1TRN 3 DIISOPROPYL-
FLUOROPHOSPHOFLUORI
DATE (DFP) 1TRN 4
HUMAN TRYPSIN, DFP
INHIBITED 1TRN 6


HYDROLASE (SERINE
PROTEINASE) TRYPSIN
(E.C.3.4.21.4) COMPLEXED
WITH THE INHIBITOR
1TRN 3 DIISOPROPYL-
FLUOROPHOSPHOFLUORI
DATE (DFP) 1TRN 4
HUMAN TRYPSIN, DFP
INHIBITED 1TRN 6


THROMBIN; CHAIN: L, H,
E, J, K, M, N;
FIBRINOPEPTIDE A-
ALPHA; CHAIN: F, G, I;


ALPHA THROMBIN;
CHAIN: L, H; EPSILON
THROMBIN; CHAIN: J, K,
M; FIBRINOPEPTIDE A-
ALPHA; CHAIN: F, N;


SERINE PROTEASE
GAMMA-THROMBIN 2HNT


SEQFOLD 
score 


177.94 












PMF 
score 




1.00 


0.58 


0.98 


0.59 


0.05 


Verify 
score 




0.69 


-0.44 


0.11 


-0.47 


OS 


Psi 
Blast 


L7e- 
100 


4 8 

*-H 1-1 


i-H 

vS 


3.4e-34 


A 






a 


»— 1 


00 




oo 
en 


o 


START 
AA 


00 
rH 


OS 






§ 




CHAIN 
ID 


< 




W 


w 


s 


U 


PDB 
m 


§ 






S* 

9 


lycp 


2hnt 


SEQID 

NO; 


) 

5 . 


r- 


5 




5 . 


5 
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PDB annotatiton 

DIELS-ALDER,
IMMUNOGLOBULIN


IMMUNOGLOBULIN
IMMUNOGLOBULIN, FAB
COMPLEX, IDIOTOPE, ANTI-
IDIOTOPE


IMMUNOGLOBULIN, FAB
COMPLEX, IDIOTOPE, ANTI-
IDIOTOPE


OBULIN), BLOOD
COAGULATION TYPE 3 2B
VON WILLEBRAND
DISEASE




K2 


Compound 

CHAIN: L; CATALYTIC 
ANTIBODY 1E9 (HEAVY 
CHAIN); CHAIN: H; 


IG HEAVY CHAINV 
REGIONS; CHAIN: A; IG 
HEAVY CHAINV 
REGIONS; CHAIN: B; IG 
HEAVY CHAINV 
REGIONS; CHAIN: C; IG 
HEAVY CHAINV 
REGIONS; CHAIN: D; 


IG HEAVY CHAINV 
REGIONS; CHAIN: A; IG 
HEAVY CHAINV 
REGIONS; CHAIN: B; IG 
HEAVY CHAINV 
REGIONS; CHAIN: C; IG 
HEAVY CHAINV 
REGIONS; CHAIN: D; 


IMMUNOGLOBULIN NMC-
4 IGG1; CHAIN: L;
IMMUNOGLOBULIN NMC-
4 IGG1; CHAIN: H; VON
WILLEBRAND FACTOR;

/-*tXAYKT. A. 




IMMUNOGLOBULIN/VIRU
S HEMAGGLUTININ
IGG2A FAB FRAGMENT
(FAB 26/9) COMPLEXED
WITH INFLUENZA 1FRG 3
HEMAGGLUTININ HA1
(STRAIN X47) (RESIDUES
101-108) 1FRG 4


1 

o| 
ll 


SEQFOLD 
score 


65.86 






66.17 


66.74 


PMF 
score 




0.42 


0.01 






Verify 
score 




-0.09 

i 


0.17 






Psi 
Blast 


1 


1.46-05 


1 

*-< 


le-05 


i 




00 

CN 
CN 








a 

CN 


START 
AA 


i— < 
CN 








1-H 

CN 


CHAIN 
ID 


Q 






a 


« 




Icic 


Icic 


lfhs 


ifrg 


lfvd 


SEQID 
NO: 




9 


CN 

9 


CN 


CN 
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i 

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PC TV 




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PDB annotation 

FOLD, GLYCOPROTEIN


CELL ADHESION NCAM;
NCAM, IMMUNOGLOBULIN
FOLD, GLYCOPROTEIN


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF2;
FGFR2; IMMUNOGLOBULIN
(IG)LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF2;
FGFR2; IMMUNOGLOBULIN
(IG)LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF2;
FGFR2; IMMUNOGLOBULIN
(IG)LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


Compound 

n rv 


NEURAL CELL ADHESION
MOLECULE; CHAIN: A, B,
C, D;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B, C,
D; FIBROBLAST GROWTH
FACTOR RECEPTOR 2;
CHAIN: E, F, G, H;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B, C,
D; FIBROBLAST GROWTH
FACTOR RECEPTOR 2;
CHAIN: E, F, G, H;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B, C,
D; FIBROBLAST GROWTH
FACTOR RECEPTOR 2;
CHAIN: E, F, G, H;


FIBROBLAST GROWTH
FACTOR 1; CHAIN: A, B;
FIBROBLAST GROWTH
FACTOR RECEPTOR 1;
CHAIN: C, D;


FIBROBLAST GROWTH
FACTOR 1; CHAIN: A, B;
FIBROBLAST GROWTH
FACTOR RECEPTOR 1;
CHAIN: C, D;


SEQFOLD 
score 














PMF 
score 


0.1? 


0.03 


0.00 


0.27 


0.21 


s 


Verify 
score 


0.11 


-0.09 


s 

o 


1 


-0.07 


0.12 


Psi 
Blast 


■ 5 


9 

CN 
»— i 




9 

i-H 


00 

4 


1.7e-10 




oo 

00 

oo 


cn 

t— 1 

oo 


OO 


00 
00 
00 


VO 
cn 




START 
AA 


r-4 

CO 

r- 


1 


VO 

3 


p 


cn 


1 


la 

0 


< 


W 


o 


o 


U 


u 


CQ - 


lepf 


Si 

t—t 


I— i 


1 

1— 1 


levt 


levt 


SEQID 
NO: 




cn 
5? 
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PDB annotation 


— V 

i 


PROTEIN KINASE
PROTEIN KINASE, CELL
CYCLE,
PHOSPHORYLATION,
STAUROSPORINE, 2 CELL
DIVISION, MITOSIS,


COMPLEX
(ISOMERASE/PROTEIN
KINASE) FKBP12;
SERINE/THREONINE-
PROTEIN KINASE
RECEPTOR R4; COMPLEX
(ISOMERASE/PROTEIN
KINASE), RECEPTOR 2
SERINE/THREONINE
KINASE


§ . « 

illiii 


Compound 


(CATALYTIC SUBUNIT) 
ALPHA ISOENZYME 
MUTANT WITH SER 139 
1 APM 4 REPLACED BY 
ALA (/S139AS) COMPLEX 
WITH THE PEPTIDE 1APM 
5 INHIBITOR PKI(5-24) 


^ SO 

h 

3 oo 

|1 


|; 

Bi 


if 

11 


CYCLIN-DEPENDENT 
PROTEIN KINASE 2; 

rttTATXT. XTfTT T . 




FK506-BINDING PROTEIN; 
CHAIN: A, C, E, G; TGF-B 
SUPERFAMILY RECEPTOR 
TYPE I; CHAIN: B,D,F,H; 


CYCLIN-DEPENDENT 
KINASE 6; CHAIN: A, C; 
CYCLIN-DEPENDENT 
KINASE INHIBITOR; 
CHAIN: B,D; 


SEQFOLD 
score 






r- t 
«*. 
OO 

c- 


80.91 


87.11 


PMF 
score 




0.89 








Verify 
score 




-0.04 








Psi 
Blast 




I 


5 


3 

vd 


t— < 

"2 

oo 
SO 






ON 

ro 








START 
AA 


■ 


8 


58 






CHAIN 
m 










< 


PDB 
m 




1aq1 


1aq1 


o 

i-H 


oo 
15 


a. 


5 


5 


9 


9 


9 
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PDB annotation 


1 


DOMAIN, AUTOINHIBITORY 
FRAGMENT, HOMODIMER 

... 1 


PHOSPHOTRANSFERASE 
FGFR1K, FIBROBLAST 


GROWTH FACTOR 
RECEPTOR 1; 


5 w 

i i 

2 § ^ CtS eS P 

fif g§g 

h £ pq £ S £ 


FGFR1K, FIBROBLAST 
GROWTH FACTOR 
RECEPTOR 1; 

TRANSFERASE, TYROSINE- 
PROTEIN KINASE, ATP- 
BINDING, 2 
PHOSPHORYLATION, 
RECEPTOR 


PHOSPHOTRANSFERASE 
PROTEIN KINASE CDK2; 


TRANSFERASE, 
SERINE/THREONINE 
PROTEIN KINASE, ATP- 
BINDING, 2 CELL CYCLE, 
CELL DIVISION, MITOSIS, 
PHOSPHORYLATION 


TRANSFERASE, 
SERINE/THREONINE 
PROTEIN KINASE, ATP- 


i 

! 


(CATALYTIC SUBUNIT) 
1CTP4 


III 

III 


SERINE/THREONINE- 
PROTEIN KINASE PAK- 
ALPHA; CHAIN: C,D; 


FGF RECEPTOR 1; CHAIN: 
A,B; 


FGF RECEPTOR 1; CHAIN: 
A,B; 




> 


DEPENDENT KINASE 2; 
CHAIN: NULL; 




W 

j 


SEQFOLD 
score 






84.95 


82.99 


< 

c 


o . 


PMF 
score 




o 




c 
< 


D 




•t 2 

li 




-0.05 




< 






* PQ 




00 

*n 


i-H 

VO 


o ; 

? : 

CO < 


G 

JO 


in 

DO 


END 
AA 








s 


n 

to j 




START 






1 

CO 
VO 




s 


NO 
VO 


CHAIN 
m 


j 


O 




n 








3 


6 

s 


1fgk 




1hcl 


1hcl 


!i 

CO 


r 








9 
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Table 6 



SEQ ID NO: 


Position of Signal 
Peptide 


Maximum score 


Mean score 


1 


24 


0.978 


0.760 


2 


32 


0.995 


0.681 


3 


37 


0.979 


0.718 


4 


18 


0.925 


0.822 


5 


28 


0.939 


0.749 


6 


41 


0.989 


0.690 


7 


26 


0.960 


0.674 


8 


16 


0.973 


0.925 


9 


24 


0.978 


0.760 


10 


18 


0.887 


0.579 


11 


42 


0.977 


0.587 


12 


21 


0.966 


0.848 


13 


25 


0.993 


0.954 


14 


28 


0.909 


0.664 


16 


23 


0.913 


0.597 


17 


42 


0.978 


0.689 


18 


21 


0.930 


0.662 


19 


45 


0.985 


0.714 


20 


37 


0.992 


0.855 


21 


31 


0.947 


0.775 


22 


20 


0.979 


0.911 


24 


30 


0.924 


0.720 


25 


26 


0.974 


0.824 


26 


28 


0.982 


0.649 


28 


16 


0.912 


0.705 


29 


27 


0.957 


0.652 


30 


22 


0.968 


0.844 


31 


23 


0.952 


0.812 


32 


18 


0.932 


0.884 


33 


29 


0.991 


0.729 


34 


26 


0.939 


0.709 


35 


29 


0.961 


0.842 


36 


16 


0.951 


0.777 


37 


27 


0.983 


0.898 


38 


17 


0.991 


0.955 


39 


33 


0.977 


0.822 


40 


17 


0.989 


0.969 


41 


30 


0.936 


0.679 


42 


24 


0.993 


0.810 


44 


22 


0.990 


0.921 


54 


18 


0.925 


0.822 


56 


18 


0.981 


0.951 


60 


28 


0.939 


0.749 


62 


33 


0.979 


0.757 


70 


41 


0.989 


0.690 


79 


26 


0.960 


0.674 


83 


18 


0.979 


0.963 


84 


22 


0.967 


0.792 


87 


25 


0.980 


0.867 


97 


16 


0.973 


0.925 


98 


24 


0.978 


0.760 


99 


17 


0.978 


0.925 
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Table 6 



SEQ ID NO: 


Position of Signal 
Peptide 


Maximum score 


Mean score 


113 


15 


0.887 


0.579 


115 


18 


0.952 


0.670 


120 


42 


0.977 


0.587 


137 


21 


0.966 


0.848 


140 


25 


0.993 


0.954 


153 


28 


0.909 


0.664 


150 


lo 


0.954 


0.747 


174 


23 


0.913 


0.597 


175 


20 


0.986 


0.936 


178 


42 


0.978 


0.689 


180 


32 


0.929 


0.583 


184 


21 


0.979 


0.941 


192 


21 


0.930 


0.662 


200 


45 


0.985 


0.714 


212 


37 


0.992 


0.855 


225 


24 


0.971 


0.882 


228 


20 


0.979 


0.911 


237 


17 


0.982 


0.964 


251 


13 


0.918 


0.692 


252 


13 


0.918 


0.692 


256 


20 


0.912 


0.693 


257 


20 


0.912 


0.693 


260 


26 


0.974 


0.824 


262 


18 


0.965 


0.833 


267 


25 


0.956 


0.765 


288 


16 


0.912 


0.705 


289 


18 


0.896 


0.634 


290 


19 


0.966 


0.897 


294 


18 


0.991 


0.973 


295 


20 


0.906 


0.580 


299 


27 


0.957 


0.652 


307 


19 


0.983 


0.871 


310 


22 


0.968 


0.844 


320 


23 


0.952 


0.812 


324 


27 


0.982 


0.911 


327 


lo 


0.983 


0.941 


328 


18 


0.932 


0.884 


332 


27 


0.990 


0.923 


335 


45 


0.983 


0.793 










346 


29 


0.991 


0.729 


354 


22 


0.978 


0.877 


363 


26 


0.939 


0.709 


364 


22 


0.966 


0.843 


375 


29 


0.961 


0.842 


379 


16 


0.951 


0.777 


401 


44 


0.975 


0.876 


407 


33 


0.977 


0.822 


417 


17 


0.989 


0.969 


418 


23 


0.974 


0.799 


422 


18 


0.981 


0.952 


426 


21 


0.982 


0.912 






WO 02/081731 



PCT/US02/01222 



Table 6 



SEQ ID NO: 


Position of Signal 
Peptide 


Maximum score 


Mean score 


428 


30 


0.936 


0.679 


429 


43 


0.978 


0.712 


433 


28 


0.993 


0.948 


434 


43 


0.930 


0.624 


437 


24 


0.993 


0.810 


438 


16 


0.978 


0.939 
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Table 7 



SEQ ID NO: 


Chromosomal location 


3 


2q11.2 


4 


20pter-p12.3 


5 


5q31 


6 


19p12 


7 


19p12 


8 


5 


11 


12p13-p12 


12 


p11.2-12.3 


13 


19p 


14 


6p12.1-21.1 


15 


19p13.1 


17 


16q12-q13 


19 


15 


20 


15 


22 


Xq13.1 


23 


12 


25 


11p15.5 


26 


20 


27 


22 


28 


12q23-24.1 


29 


20 


30 


13 


31 


12 


33 


15 


36 


4q28 


37 


14q24.3 


38 


10 


39 


20 


41 


17q12-q21 


42 


14 


44 


1q24.1-25.2 


45 


2 


47 


3q21-q25 


48 


9 


49 


14 


50 


6q14.1-15 


51 


19 


52 


11 


53 


20 


54 


16 


55 


14 


56 


3 


57 


19 


58 


7p15.1-p13 


59 


19 


61 


2 


62 


19 


63 


16 


66 


15 


70 


1p31.1-33 


71 


9 


72 


16 
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SEQ ID NO: 


Chromosomal location 


74 


5q31-q33 


75 


3p21.1-q13.13 


76 


2 


77 


2 


78 


21q22.1 


79 


Xp11.22-p11.21 


80 


2 


81 


19 


82 


20 


83 


19p13.3 


84 


19 


85 


3 


86 


8 


87 


1p13 


88 


16 


89 


18q21.1-q22 


90 


11q13.1-q13.3 


91 


18p11.23-p11.21 


92 


17 


93 


10 


94 


3 


95 


X 


96 


6q14.2-16.1 


97 


1q21.2-22 


98 


1q21.2-22 


99 


6 


102 


8q22-q23 


103 


10p11.2 


104 


17 


105 


17 


106 


2 


107 


1 


108 


16 


109 


17q21.3-q22 


110 


11q 


111 


3p21.1-q13.13 


112 


16 


113 


5 


114 


9 


115 


3p13-q26.1 


116 


5 


117 


7q31 


118 


14 


119 


14 


120 


19 


121 


19 


122 


6q27 


123 


14 


124 


1q21-q22 


125 


6 


126 


17q25 


127 


15 
SEQ ID NO: 


Chromosomal location 


129 


14q31 


130 


1p36.1 


131 


11 


132 


20 


133 


20p11.23-p11.21 


134 


1p32 


135 


2q31 


136 


X 


138 


12p13 


139 


9 


140 


p34.1-34.3 


141 


19q12 


142 


15q26 


143 


22q11.21 


144 


17q12 


145 


4p16.3 


146 


22 


147 


16p11.2 


148 


18q12 


150 


4 


151 


7p12-q11.21 


152 


14 


153 


14q32.33 


155 


1p34 


156 


16p13.3 


157 


12p13.3 


158 


5 


159 


8 


160 


19 


161 


4 


162 


1 


163 


11q23 


164 


3 


165 


12q22 


168 


19 


170 


1 


171 


18q12 


173 


7 


174 


13 


175 


2p23.3-q32.3 


176 


16 


178 


10 


179 


1q21-q25 


180 


19p13.3 


181 


1 


184 


1p35.1-36.23 


185 


1 


186 


18 


187 


3p13-q26.1 


188 


3 


189 


17 


190 


6 
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SEQ ID NO: 


Chromosomal location 


193 


11p15.5 


194 


14q32 


195 


12 


196 


10q24 


198 


1p36.1 


199 


5q22 


200 


11 


201 


2q31 


202 


17 


206 


Xp11.23 


207 


9q34 


208 


19 


209 


20 


210 


11q23 


211 


16p12 


212 


19q13.1 


213 


7p15 


214 


15 


215 


1p36.21-36.33 


216 


11 


217 


22q11.2 


218 


15 


219 


19q13.4 


222 


19 


223 


1q25.2 


226 


1 


227 


1p36.11-36.23 


228 


1p36.3-p36.13 


230 


17 


231 


7q33-q34 


232 


3 


233 


9 


234 


10 


235 


17 


236 


4 


237 


19q13.4 


238 


4q25 


239 


2 


240 


7 


241 


12 


243 


6p21.3 


244 


3p13-q26.1 


245 


17 


246 


1p34.1 


247 


3q23 


248 


3p21.3 


249 


20 


250 


20 


251 


18q12-q21 


252 


18q12-q21 


253 


14 


254 


1p35.3-p35.1 
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SEQ ID NO: 


Chromosomal location 


256 


6q25-q26 


257 


6q25-q26 


258 


1q21-q23 


259 


16p13.2-16p13.11 


260 


14q21.1-q24.1 


261 


2p23.3-q32.3 


262 


12 


263 


19 


264 


4q28 


265 


2 


266 


2 


267 


1q21-q23 


268 


20p12.3-p13 


269 


4 


270 


6 


271 


2p23.3-q14.3 


272 


18q21 


273 


18q21 


274 


14q22 


275 


6p21.3 


276 


5 


280 


8 


281 


4q22-q24 


282 


2 


283 


7q22-q31.1 


284 


11 


285 


llql23 


286 


10 


287 


19 


290 


17 


291 


4q22 


292 


1p36.11-36.23 


293 


19 


294 


22 


296 


3 


297 


4p16 


298 


6 


299 


8q13 


300 


20 


301 


15 


302 


22q11.2-q22 


303 


15 


304 


6 


306 


6 


307 


9p24.2 


308 


2p23.3-q24.3 


309 


14 


310 


6 


311 


2 


312 


4 


313 


19pter-19p13.3 


314 


3 
SEQ ID NO: 


Chromosomal location 


olo 


1 \A O 
JL1P1Z-14.Z 




iy 


HQ 

31o 


1*7 


lift 


1*7 
1 / 


32U 


j(|14 


323 


4 


324 


3 P 


325 


6p21.1-21.31 


326 


17p11.2 


327 


9 


328 


5q23 


329 


2 


330 


3 


331 


1p21.1-22.1 


332 


9 


333 


7 


334 


11q13 


337 


14 


338 


7q35-q36 


339 


13 


340 


6q11.1-22.33 


341 


11q12-q13.1 


343 


10 


344 


16 


345 


16 


346 


11q22 


347 


19 


348 


15q24-q26 


350 


"V_ till 11 

Xp11.21-11.22 


354 


16 


355 


19 


356 


11 


358 


Xp11.23 


359 


4 


360 


8 


362 


4 


363 


11 


364 


11q13 


365 


7q31 


366 


22q13.31-13.32 


367 


5 


370 


19 


371 


7q31.1-7q31.33 


372 


2q37.3 


373 


3 


374 


16 


375 


19q13.4 


376 


18q12 


377 


18q12 


379 


8 


380 


11q13 


381 


6 
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SEQ ID NO: 


Chromosomal location 


385 


4q28 


386 


15 


387 


10 


388 


17 


389 


11p15.4 


390 


6p21.3 


391 


22q13 


392 


3 


393 


19 


394 


15 


395 


1 


396 


6p21.2-p21.3 


397 


15 


399 


7q31 


400 


14 


402 


Xq28 


403 


10 


404 


16 


406 


16 


408 


11 


412 


20q12-13.1 


413 


15 


414 


17 


415 


4 


416 


12q 


419 


21q22.1 


420 


16p11.2 


422 


6 


424 


21 


426 


14 


428 


14 


429 


1q22-q23 


430 


11q13 


431 


3 


432 


2 


433 


19q13.1 


434 


20q13.1 


435 


18q23 


436 


11q24 


437 


10 


438 


4q21-q25 
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SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: in Priority Application 
USSN 09/774,528 


52 


52 


54 


53 


53 


55 


54 


54 


56 


55 


55 


57 


56 


56 


58 


57 


57 


59 


58 


58 


60 


59 


59 


61 


60 


60 


62 


61 


61 


63 


62 


62 


64 


63 


63 


65 


64 


64 


66 


65 


65 


67 


66 


66 


68 


67 


67 


69 


68 


68 


70 


69 


69 


71 


70 


70 


72 


71 


71 


73 


72 


72 


74 


73 


73 


75 


74 


74 


76 


75 


75 


77 


76 


76 


78 


77 


77 


79 


78 


78 


80 


79 


79 


81 


80 


80 


82 


81 


81 


83 


82 


82 


84 


83 


83 


85 


84 


84 


86 


85 


85 


87 


86 


86 


88 


87 


87 


89 


88 


88 


90 


89 


89 


91 


90 


90 


92 


91 


91 


93 


92 


92 


94 


93 


93 


95 


94 


94 


96 


95 


95 


97 


96 


96 


98 


97 


97 


99 


98 


98 


100 


99 


99 


101 


100 


100 


102 


101 


101 


103 


102 


102 


104 


103 


103 


105 
SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: in Priority Application 
USSN 09/774,528 




104 


104 


106 


105 


105 


107 


106 


106 


108 


107 


107 


109 


108 


108 


110 


109 


109 


111 


110 


110 


112 


111 


111 


113 


112 


112 


114 


113 


113 


115 


114 


114 


116 


115 


115 


117 


116 


116 


118 


117 


117 


119 


118 


118 


120 


119 


119 


121 


120 


120 


122 


121 


121 


123 


122 


122 


124 


123 


123 


125 


124 


124 


126 


125 


125 


127 


126 


126 


128 


127 


127 


129 


128 


128 


130 


129 


129 


131 


130 


130 


132 


131 


131 


133 


132 


132 


134 


133 


133 


135 


134 


134 


136 


135 


135 


137 


136 


136 


138 


137 


137 


139 


138 


138 


140 


139 


139 


141 


140 


140 


142 


141 


141 


143 


142 


142 


144 




143 


145 


144 


144 


146 


145 


145 


147 


146 


146 


148 


147 


147 


149 


148 


148 


150 


149 


149 


151 


150 


150 


152 


151 


151 


153 


152 


152 


154 


153 


153 


155 


154 


154 


156 


155 


155 


157 
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SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: in Priority Application 
USSN 09/774,528 


156 


156 


158 


157 


157 


159 


158 


158 


160 


159 


159 


161 


160 


160 


162 


161 


161 


163 


162 


162 


164 


163 


163 


165 


164 


164 


166 


165 


165 


167 


166 


166 


168 


167 


167 


169 


168 


168 


170 


169 


169 


171 


170 


170 


172 


171 


171 


173 


172 


172 


174 


173 


173 


175 


174 


174 


176 


175 


175 


177 


176 


176 


178 


177 


177 


179 


178 


178 


180 


179 


179 


181 


180 


180 


182 


181 


181 


183 


182 


182 


184 


183 


183 


185 


184 


184 


186 


185 


185 


187 


186 


186 


188 


187 


187 


189 


188 


188 


190 


189 


189 


191 


190 


190 


192 


191 


191 


193 


192 


192 


194 


193 


193 


195 


194 


194 


196 


195 


195 


197 


196 


196 


198 


197 


197 


199 


198 


198 


200 


199 


199 


201 


200 


200 


202 


201 


201 


203 


202 


202 


204 


203 


203 


205 


204 


204 


206 


205 


205 


207 


206 


206 


208 


207 


207 


209 
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SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: in Priority Application 
USSN 09/774,528 






210 


209 


209 


211 


210 


210 


212 


211 


211 


213 


212 


212 


214 


213 


213 


215 


214 


214 


216 


215 


215 


217 


216 


216 


218 


217 


217 


219 


218 


218 


220 


219 


219 


221 


220 


220 


222 


221 


221 


223 


222 


222 


224 


223 


223 


225 


224 


224 


226 


225 


225 


227 


226 


226 


228 


227 


227 


229 


228 


228 


230 


229 


229 


231 


230 


230 


232 


231 


231 


233 


232 


232 


234 


233 


233 


235 


234 


234 


236 


235 


235 


237 


236 


236 


238 


237 


237 


239 


238 


238 


240 


239 


239 


241 


240 


240 


242 


241 


241 


243 


242 


242 


244 


243 


243 


245 


244 


244 


246 


245 


245 


247 


246 


246 


248 








248 


248 


250 


249 


249 


251 


250 


250 


252 


251 


251 


253 


252 


252 


254 




253 


255 


254 


254 


256 


255 


255 


257 


256 


256 


258 


257 


257 


259 


258 


258 


260 


259 


259 


261 
SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: in Priority Application 
USSN 09/774,528 


260 


260 


262 


261 


261 


263 


262 


262 


264 


263 


263 


265 


264 


264 


266 


265 


265 


267 


266 


266 


268 


267 


267 


269 


268 


268 


270 


269 


269 


271 


270 


270 


272 


271 


271 


273 


272 


272 


274 


273 


273 


275 


274 


274 


276 


275 


275 


277 


276 


276 


278 


277 


277 


279 


278 


278 


280 


279 


279 


281 


280 


280 


282 


281 


281 


283 


282 


282 


284 


283 


283 


285 


284 


284 


286 


285 


285 


287 


286 


286 


288 


287 


287 


289 


288 


288 


290 


289 


289 


291 


290 


290 


292 


291 


291 


293 


292 


292 


294 


293 


293 


295 


294 


294 


296 


295 


295 


297 


296 


296 


298 


297 


297 


299 


298 


298 


300 


299 


299 


301 


300 


300 


302 


301 


301 


303 


302 


302 


304 


303 


303 


305 


304 


304 


306 


305 


305 


307 


306 


306 


308 


307 


307 


309 


308 


308 


310 


309 


309 


311 


310 


310 


312 


311 


311 


313 
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SEQ ID NO: of Full-length 
SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: in Priority Application 
USSN 09/774,528 


312 


312 


314 


313 


313 


315 


314 


314 


316 


315 


315 


317 


316 


316 


318 


317 


317 


319 


318 


318 


320 


319 


319 


321 


320 


320 


322 


321 


321 


323 


322 


322 


324 


323 


323 


325 


324 


324 


326 


325 


325 


327 


326 


326 


328 


327 


327 


329 


328 


328 


330 


329 


329 


331 


330 


330 


332 


331 


331 


333 


332 


332 


334 


333 


333 


335 


334 


334 


336 


335 


335 


337 


336 


336 


338 


337 


337 


339 


338 


338 


340 


339 


339 


341 


340 


340 


342 


341 


341 


343 


342 


342 


344 


343 


343 


345 


344 


344 


346 


345 


345 


347 


346 


346 


348 


347 


347 


349 


348 


348 


350 


349 


349 


351 


350 


350 


352 


351 


351 


353 


352 


352 


354 


353 


353 


355 


354 


354 


356 


355 


355 


357 


356 


356 


358 


357 


357 


360 


358 


358 


361 


359 


359 


362 


360 


360 


363 


361 


361 


364 


362 


362 


365 


363 


363 


366 
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SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: in Priority Application 
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364 


364 


367 


365 


365 


368 


366 


366 


369 


367 


367 


370 


368 


368 


371 


369 


369 


372 


370 


370 


373 


371 


371 


374 


372 


372 


375 


373 


373 


376 


374 


374 


377 


375 


375 


378 


376 


376 


379 


377 


377 


380 


378 


378 


381 


379 


379 


382 


380 


380 


383 


381 


381 


384 


382 


382 


385 


383 


383 


386 


384 


384 


387 


385 


385 


388 


386 


386 


389 


387 


387 


390 


388 


388 


391 


389 


389 


392 


390 


390 


393 


391 


391 


394 


392 


392 


395 


393 


393 


396 


394 


394 


397 


395 


395 


398 


396 


396 


399 


397 


397 


400 


398 


398 


401 


399 


399 


402 


400 


400 


403 


401 


401 


404 


402 


402 


405 


403 


403 


406 


404 


404 


407 


405 


405 


408 


406 


406 


409 


407 


407 


410 


408 


408 


411 


409 


409 


412 


410 


410 


413 


411 


411 


414 


412 


412 


415 


413 


413 


416 


414 


414 


417 


415 


415 


418 
SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: in Priority Application 
USSN 09/774,528 


416 


416 


419 


417 


417 


420 


418 


418 


421 


419 


419 


422 


420 


420 


423 


421 


421 


424 


422 


422 


425 


423 


423 


426 


424 


424 


427 


425 


425 


428 


426 


426 


429 


427 


427 


430 


428 


428 


431 


429 


429 


432 


430 


430 


433 


431 


431 


434 


432 


432 


435 


433 


433 


436 


434 


434 


437 


435 


435 


438 


436 


436 


439 


437 


437 


440 


438 


438 


441 
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WHAT IS CLAIMED IS: 

1. An isolated polynucleotide comprising a nucleotide sequence selected from the 
group consisting of SEQ ID NO: 1-438, a mature protein coding portion of SEQ ID NO: 
1-438, an active domain coding portion of SEQ ID NO: 1-438, and complementary 
sequences thereof. 

2. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent 
hybridization conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide has greater than about 90% sequence identity with the 
polynucleotide of claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1. 

7. An expression vector comprising the polynucleotide of claim 1. 

8. A host cell genetically engineered to comprise the polynucleotide of claim 1. 

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 
operatively associated with a regulatory sequence that modulates expression of the 
polynucleotide in the host cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group 
consisting of: 
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(a) a polypeptide encoded by any one of the polynucleotides of 
claim 1; and 

(b) a polypeptide encoded by a polynucleotide hybridizing under 
stringent conditions with any one of SEQ ID NO: 1-438. 

11. A composition comprising the polypeptide of claim 10 and a carrier. 

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a 
complex with the polynucleotide of claim 1 for a period sufficient to form the complex; 
and 

b) detecting the complex, so that if a complex is detected, the 
polynucleotide of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions 
with nucleic acid primers that anneal to the polynucleotide of claim 1 under such 
conditions; 

b) amplifying a product comprising at least a portion of the 
polynucleotide of claim 1; and 

c) detecting said product and thereby the polynucleotide of claim 1 in 
the sample. 

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the 
method further comprises reverse transcribing an annealed RNA molecule into a cDNA 
polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising: 
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a) contacting the sample with a compound that binds to and forms a 
complex with the polypeptide under conditions and for a period sufficient to form the 
complex; and 

b) detecting formation of the complex, so that if a complex formation 
is detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound 
complex is detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a 
cell, under conditions sufficient to form a polypeptide/compound complex, wherein the 
complex drives expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence 
expression, so that if the polypeptide/compound complex is detected, a compound that 
binds to the polypeptide of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising: 

a) culturing a host cell comprising a polynucleotide sequence selected 
from SEQ ID NO: 1-438, a mature protein coding portion of SEQ ID NO: 1-438, an 
active domain coding portion of SEQ ID NO: 1-438, complementary sequences thereof 
and a polynucleotide sequence hybridizing under stringent conditions to SEQ ID NO: 1- 
438, under conditions sufficient to express the polypeptide in said cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 
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20. An isolated polypeptide comprising an amino acid sequence selected from the 
group consisting of any one of the polypeptides encoded by SEQ ID NO: 1-438, the 
mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide 
array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence 
information of at least one of SEQ ID NO: 1-438. 

23. The collection of claim 22, wherein the collection is provided on a nucleic acid 
array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of 
the polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of 
the polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer- 
readable format. 

27. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 
and a pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising an antibody that specifically 
binds to a polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier. 
The special technical feature of Group V is a method of detecting the polypeptide of Group II. 

The special technical feature of Group VI is a method of identifying a compound that binds to the polypeptide of Group II. 

The special technical feature of Group VII is a method of treatment using the polypeptide of Group II. 
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inventive concept. 
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